Master's in Data Science - Machine Learning
Handling Missing Values, Outliers, and Correlations
Author: Ramón Morillo Barrera
Dataset: Application data
In this notebook we carry out a graphical exploratory analysis to visualize and understand how the variables behave. We handle null (missing) values and outliers, and we study the correlation between variables.
As noted earlier, the train-test split will be stratified because the target variable is imbalanced.
Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.impute import KNNImputer
from termcolor import colored, cprint
import scipy.stats as ss
import warnings
import sys
from scipy.stats import chi2_contingency
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 500)
Functions
sys.path.append('../src')
import funciones_auxiliares as f_aux
sys.path.remove('../src')
# Constant: random seed
seed = 12354
Importing the dataset
df_loan = pd.read_csv('../../data_loan_status/data_preprocessing/pd_data_initial_preprocessing.csv')
df_loan.head()
| SK_ID_CURR | COMMONAREA_AVG | COMMONAREA_MEDI | COMMONAREA_MODE | NONLIVINGAPARTMENTS_AVG | NONLIVINGAPARTMENTS_MEDI | NONLIVINGAPARTMENTS_MODE | FONDKAPREMONT_MODE | LIVINGAPARTMENTS_MEDI | LIVINGAPARTMENTS_AVG | LIVINGAPARTMENTS_MODE | FLOORSMIN_MODE | FLOORSMIN_AVG | FLOORSMIN_MEDI | YEARS_BUILD_MODE | YEARS_BUILD_MEDI | YEARS_BUILD_AVG | OWN_CAR_AGE | LANDAREA_MEDI | LANDAREA_AVG | LANDAREA_MODE | BASEMENTAREA_MODE | BASEMENTAREA_AVG | BASEMENTAREA_MEDI | EXT_SOURCE_1 | NONLIVINGAREA_AVG | NONLIVINGAREA_MODE | NONLIVINGAREA_MEDI | ELEVATORS_MEDI | ELEVATORS_AVG | ELEVATORS_MODE | WALLSMATERIAL_MODE | APARTMENTS_AVG | APARTMENTS_MODE | APARTMENTS_MEDI | ENTRANCES_MEDI | ENTRANCES_MODE | ENTRANCES_AVG | LIVINGAREA_AVG | LIVINGAREA_MODE | LIVINGAREA_MEDI | HOUSETYPE_MODE | FLOORSMAX_MODE | FLOORSMAX_AVG | FLOORSMAX_MEDI | YEARS_BEGINEXPLUATATION_MODE | YEARS_BEGINEXPLUATATION_AVG | YEARS_BEGINEXPLUATATION_MEDI | TOTALAREA_MODE | EMERGENCYSTATE_MODE | OCCUPATION_TYPE | EXT_SOURCE_3 | AMT_REQ_CREDIT_BUREAU_WEEK | AMT_REQ_CREDIT_BUREAU_MON | AMT_REQ_CREDIT_BUREAU_HOUR | AMT_REQ_CREDIT_BUREAU_DAY | AMT_REQ_CREDIT_BUREAU_YEAR | AMT_REQ_CREDIT_BUREAU_QRT | NAME_TYPE_SUITE | OBS_60_CNT_SOCIAL_CIRCLE | DEF_60_CNT_SOCIAL_CIRCLE | OBS_30_CNT_SOCIAL_CIRCLE | DEF_30_CNT_SOCIAL_CIRCLE | EXT_SOURCE_2 | AMT_GOODS_PRICE | AMT_ANNUITY | CNT_FAM_MEMBERS | DAYS_LAST_PHONE_CHANGE | HOUR_APPR_PROCESS_START | REG_REGION_NOT_LIVE_REGION | ORGANIZATION_TYPE | NAME_CONTRACT_TYPE | FLAG_OWN_CAR | CODE_GENDER | AMT_CREDIT | AMT_INCOME_TOTAL | CNT_CHILDREN | NAME_INCOME_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | REGION_POPULATION_RELATIVE | NAME_EDUCATION_TYPE | DAYS_BIRTH | DAYS_EMPLOYED | DAYS_REGISTRATION | DAYS_ID_PUBLISH | FLAG_MOBIL | FLAG_EMP_PHONE | FLAG_WORK_PHONE | FLAG_CONT_MOBILE | TARGET | FLAG_OWN_REALTY | LIVE_REGION_NOT_WORK_REGION | FLAG_EMAIL | REGION_RATING_CLIENT | REGION_RATING_CLIENT_W_CITY | WEEKDAY_APPR_PROCESS_START | FLAG_PHONE | REG_CITY_NOT_LIVE_CITY | 
REG_CITY_NOT_WORK_CITY | LIVE_CITY_NOT_WORK_CITY | REG_REGION_NOT_WORK_REGION | FLAG_DOCUMENT_4 | FLAG_DOCUMENT_5 | FLAG_DOCUMENT_2 | FLAG_DOCUMENT_3 | FLAG_DOCUMENT_11 | FLAG_DOCUMENT_10 | FLAG_DOCUMENT_9 | FLAG_DOCUMENT_8 | FLAG_DOCUMENT_7 | FLAG_DOCUMENT_6 | FLAG_DOCUMENT_12 | FLAG_DOCUMENT_13 | FLAG_DOCUMENT_19 | FLAG_DOCUMENT_18 | FLAG_DOCUMENT_17 | FLAG_DOCUMENT_16 | FLAG_DOCUMENT_15 | FLAG_DOCUMENT_14 | FLAG_DOCUMENT_20 | FLAG_DOCUMENT_21 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 100002 | 0.0143 | 0.0144 | 0.0144 | 0.0000 | 0.0000 | 0.0 | reg oper account | 0.0205 | 0.0202 | 0.022 | 0.1250 | 0.1250 | 0.1250 | 0.6341 | 0.6243 | 0.6192 | NaN | 0.0375 | 0.0369 | 0.0377 | 0.0383 | 0.0369 | 0.0369 | 0.083037 | 0.0000 | 0.0 | 0.00 | 0.00 | 0.00 | 0.0000 | Stone, brick | 0.0247 | 0.0252 | 0.0250 | 0.0690 | 0.0690 | 0.0690 | 0.0190 | 0.0198 | 0.0193 | block of flats | 0.0833 | 0.0833 | 0.0833 | 0.9722 | 0.9722 | 0.9722 | 0.0149 | No | Laborers | 0.139376 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Unaccompanied | 2.0 | 2.0 | 2.0 | 2.0 | 0.262949 | 351000.0 | 24700.5 | 1.0 | -1134.0 | 10 | 0 | Business Entity Type 3 | Cash loans | N | M | 406597.5 | 202500.0 | 0 | Working | Single / not married | House / apartment | 0.018801 | Secondary / secondary special | -9461 | -637 | -3648.0 | -2120 | 1 | 1 | 0 | 1 | 1 | Y | 0 | 0 | 2 | 2 | WEDNESDAY | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 100003 | 0.0605 | 0.0608 | 0.0497 | 0.0039 | 0.0039 | 0.0 | reg oper account | 0.0787 | 0.0773 | 0.079 | 0.3333 | 0.3333 | 0.3333 | 0.8040 | 0.7987 | 0.7960 | NaN | 0.0132 | 0.0130 | 0.0128 | 0.0538 | 0.0529 | 0.0529 | 0.311267 | 0.0098 | 0.0 | 0.01 | 0.08 | 0.08 | 0.0806 | Block | 0.0959 | 0.0924 | 0.0968 | 0.0345 | 0.0345 | 0.0345 | 0.0549 | 0.0554 | 0.0558 | block of flats | 0.2917 | 0.2917 | 0.2917 | 0.9851 | 0.9851 | 0.9851 | 0.0714 | No | Core staff | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Family | 1.0 | 0.0 | 1.0 | 0.0 | 0.622246 | 1129500.0 | 35698.5 | 2.0 | -828.0 | 11 | 0 | School | Cash loans | N | F | 1293502.5 | 270000.0 | 0 | State servant | Married | House / apartment | 0.003541 | Higher education | -16765 | -1188 | -1186.0 | -291 | 1 | 1 | 0 | 1 | 0 | N | 0 | 0 | 1 | 1 | MONDAY | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 100004 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 26.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Laborers | 0.729567 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Unaccompanied | 0.0 | 0.0 | 0.0 | 0.0 | 0.555912 | 135000.0 | 6750.0 | 1.0 | -815.0 | 9 | 0 | Government | Revolving loans | Y | M | 135000.0 | 67500.0 | 0 | Working | Single / not married | House / apartment | 0.010032 | Secondary / secondary special | -19046 | -225 | -4260.0 | -2531 | 1 | 1 | 1 | 1 | 0 | Y | 0 | 0 | 2 | 2 | MONDAY | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 100006 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Laborers | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Unaccompanied | 2.0 | 0.0 | 2.0 | 0.0 | 0.650442 | 297000.0 | 29686.5 | 2.0 | -617.0 | 17 | 0 | Business Entity Type 3 | Cash loans | N | F | 312682.5 | 135000.0 | 0 | Working | Civil marriage | House / apartment | 0.008019 | Secondary / secondary special | -19005 | -3039 | -9833.0 | -2437 | 1 | 1 | 0 | 1 | 0 | Y | 0 | 0 | 2 | 2 | WEDNESDAY | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 100007 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Core staff | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Unaccompanied | 0.0 | 0.0 | 0.0 | 0.0 | 0.322738 | 513000.0 | 21865.5 | 1.0 | -1106.0 | 11 | 0 | Religion | Cash loans | N | M | 513000.0 | 121500.0 | 0 | Working | Single / not married | House / apartment | 0.028663 | Secondary / secondary special | -19932 | -3038 | -4311.0 | -3458 | 1 | 1 | 0 | 1 | 0 | Y | 0 | 0 | 2 | 2 | THURSDAY | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
df_loan.columns
Index(['SK_ID_CURR', 'COMMONAREA_AVG', 'COMMONAREA_MEDI', 'COMMONAREA_MODE',
'NONLIVINGAPARTMENTS_AVG', 'NONLIVINGAPARTMENTS_MEDI',
'NONLIVINGAPARTMENTS_MODE', 'FONDKAPREMONT_MODE',
'LIVINGAPARTMENTS_MEDI', 'LIVINGAPARTMENTS_AVG',
...
'FLAG_DOCUMENT_12', 'FLAG_DOCUMENT_13', 'FLAG_DOCUMENT_19',
'FLAG_DOCUMENT_18', 'FLAG_DOCUMENT_17', 'FLAG_DOCUMENT_16',
'FLAG_DOCUMENT_15', 'FLAG_DOCUMENT_14', 'FLAG_DOCUMENT_20',
'FLAG_DOCUMENT_21'],
dtype='object', length=122)
Converting categorical variables
Cast the object-typed variables to category.
list_var_cat, other = f_aux.dame_variables_categoricas(dataset=df_loan)
df_loan[list_var_cat] = df_loan[list_var_cat].astype("category")
list_var_continuous = list(df_loan.select_dtypes('float').columns)
df_loan[list_var_continuous] = df_loan[list_var_continuous].astype(float)
df_loan.dtypes
SK_ID_CURR int64 COMMONAREA_AVG float64 COMMONAREA_MEDI float64 COMMONAREA_MODE float64 NONLIVINGAPARTMENTS_AVG float64 NONLIVINGAPARTMENTS_MEDI float64 NONLIVINGAPARTMENTS_MODE float64 FONDKAPREMONT_MODE category LIVINGAPARTMENTS_MEDI float64 LIVINGAPARTMENTS_AVG float64 LIVINGAPARTMENTS_MODE float64 FLOORSMIN_MODE float64 FLOORSMIN_AVG float64 FLOORSMIN_MEDI float64 YEARS_BUILD_MODE float64 YEARS_BUILD_MEDI float64 YEARS_BUILD_AVG float64 OWN_CAR_AGE float64 LANDAREA_MEDI float64 LANDAREA_AVG float64 LANDAREA_MODE float64 BASEMENTAREA_MODE float64 BASEMENTAREA_AVG float64 BASEMENTAREA_MEDI float64 EXT_SOURCE_1 float64 NONLIVINGAREA_AVG float64 NONLIVINGAREA_MODE float64 NONLIVINGAREA_MEDI float64 ELEVATORS_MEDI float64 ELEVATORS_AVG float64 ELEVATORS_MODE float64 WALLSMATERIAL_MODE category APARTMENTS_AVG float64 APARTMENTS_MODE float64 APARTMENTS_MEDI float64 ENTRANCES_MEDI float64 ENTRANCES_MODE float64 ENTRANCES_AVG float64 LIVINGAREA_AVG float64 LIVINGAREA_MODE float64 LIVINGAREA_MEDI float64 HOUSETYPE_MODE category FLOORSMAX_MODE float64 FLOORSMAX_AVG float64 FLOORSMAX_MEDI float64 YEARS_BEGINEXPLUATATION_MODE float64 YEARS_BEGINEXPLUATATION_AVG float64 YEARS_BEGINEXPLUATATION_MEDI float64 TOTALAREA_MODE float64 EMERGENCYSTATE_MODE category OCCUPATION_TYPE category EXT_SOURCE_3 float64 AMT_REQ_CREDIT_BUREAU_WEEK float64 AMT_REQ_CREDIT_BUREAU_MON float64 AMT_REQ_CREDIT_BUREAU_HOUR float64 AMT_REQ_CREDIT_BUREAU_DAY float64 AMT_REQ_CREDIT_BUREAU_YEAR float64 AMT_REQ_CREDIT_BUREAU_QRT float64 NAME_TYPE_SUITE category OBS_60_CNT_SOCIAL_CIRCLE float64 DEF_60_CNT_SOCIAL_CIRCLE float64 OBS_30_CNT_SOCIAL_CIRCLE float64 DEF_30_CNT_SOCIAL_CIRCLE float64 EXT_SOURCE_2 float64 AMT_GOODS_PRICE float64 AMT_ANNUITY float64 CNT_FAM_MEMBERS float64 DAYS_LAST_PHONE_CHANGE float64 HOUR_APPR_PROCESS_START int64 REG_REGION_NOT_LIVE_REGION int64 ORGANIZATION_TYPE category NAME_CONTRACT_TYPE category FLAG_OWN_CAR category CODE_GENDER category AMT_CREDIT float64 AMT_INCOME_TOTAL 
float64 CNT_CHILDREN int64 NAME_INCOME_TYPE category NAME_FAMILY_STATUS category NAME_HOUSING_TYPE category REGION_POPULATION_RELATIVE float64 NAME_EDUCATION_TYPE category DAYS_BIRTH int64 DAYS_EMPLOYED int64 DAYS_REGISTRATION float64 DAYS_ID_PUBLISH int64 FLAG_MOBIL int64 FLAG_EMP_PHONE int64 FLAG_WORK_PHONE int64 FLAG_CONT_MOBILE int64 TARGET int64 FLAG_OWN_REALTY category LIVE_REGION_NOT_WORK_REGION int64 FLAG_EMAIL int64 REGION_RATING_CLIENT int64 REGION_RATING_CLIENT_W_CITY int64 WEEKDAY_APPR_PROCESS_START category FLAG_PHONE int64 REG_CITY_NOT_LIVE_CITY int64 REG_CITY_NOT_WORK_CITY int64 LIVE_CITY_NOT_WORK_CITY int64 REG_REGION_NOT_WORK_REGION int64 FLAG_DOCUMENT_4 int64 FLAG_DOCUMENT_5 int64 FLAG_DOCUMENT_2 int64 FLAG_DOCUMENT_3 int64 FLAG_DOCUMENT_11 int64 FLAG_DOCUMENT_10 int64 FLAG_DOCUMENT_9 int64 FLAG_DOCUMENT_8 int64 FLAG_DOCUMENT_7 int64 FLAG_DOCUMENT_6 int64 FLAG_DOCUMENT_12 int64 FLAG_DOCUMENT_13 int64 FLAG_DOCUMENT_19 int64 FLAG_DOCUMENT_18 int64 FLAG_DOCUMENT_17 int64 FLAG_DOCUMENT_16 int64 FLAG_DOCUMENT_15 int64 FLAG_DOCUMENT_14 int64 FLAG_DOCUMENT_20 int64 FLAG_DOCUMENT_21 int64 dtype: object
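The cast above relies on the project helper `f_aux.dame_variables_categoricas`, whose implementation is not shown here. As a rough, self-contained sketch of the same idea, the hypothetical function below (`cast_text_columns_to_category` is an illustrative name, not part of the notebook) finds the text columns of a toy DataFrame and casts them to `category`.

```python
import pandas as pd


def cast_text_columns_to_category(df):
    """Return a copy of df with text columns cast to category, plus their names.

    Hypothetical stand-in for the notebook's helper; the real helper may
    select columns differently.
    """
    text_columns = [
        col for col in df.columns
        if pd.api.types.is_string_dtype(df[col])
    ]
    # astype with a dict only touches the listed columns and returns a new frame
    result = df.astype({col: "category" for col in text_columns})
    return result, text_columns


# Toy data reusing a few column names from the dataset
toy = pd.DataFrame({
    "NAME_CONTRACT_TYPE": ["Cash loans", "Revolving loans", "Cash loans"],
    "AMT_CREDIT": [406597.5, 135000.0, 513000.0],
    "TARGET": [1, 0, 0],
})
toy_cast, cast_columns = cast_text_columns_to_category(toy)
print(toy_cast.dtypes)
```

Only the text column changes type; numeric columns keep their original dtype.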
Stratified train-test split
I will split the dataset into train and test sets while preserving the proportion of the target variable. First, I will plot that proportion.
target_count = df_loan.groupby('TARGET').agg({'TARGET':'count'}).reset_index(drop=True)
target_count['value'] = list(target_count.index)
target_count
| | TARGET | value |
|---|---|---|
| 0 | 282686 | 0 |
| 1 | 24825 | 1 |
df_plot_loan_status = df_loan['TARGET']\
.value_counts(normalize=True)\
.mul(100).rename('percent').reset_index()
df_plot_loan_status_conteo = df_loan['TARGET'].value_counts(normalize=True).reset_index()
df_plot_loan_status_conteo
| | TARGET | proportion |
|---|---|---|
| 0 | 0 | 0.919271 |
| 1 | 1 | 0.080729 |
sns.set_theme(style="whitegrid")
fig, ax = plt.subplots(figsize=(10, 6))  # Use a larger figure
# Bar chart of the counts per TARGET value
sns.barplot(
    data=target_count,
    x='value',
    y='TARGET',
    ax=ax,
    hue='value',
    dodge=False,  # Do not offset the bars by hue
    palette="pastel",
    edgecolor="0.2"  # Draw a border around the bars
)
# Title and axis labels
ax.set_title('Count of TARGET values', fontsize=18, fontweight='bold', color='darkblue')
ax.set_ylabel('Count', fontsize=14, color='darkgrey')
ax.set_xlabel('Value', fontsize=14, color='darkgrey')
# Add the count labels above the bars
for container in ax.containers:
    ax.bar_label(container, fmt='{:,.0f}', label_type="edge", padding=3, fontsize=12, color="black")
sns.set_theme(style="whitegrid")
fig, ax = plt.subplots(figsize=(10, 6))  # Use a larger figure
# Bar chart of the proportion of each TARGET value
sns.barplot(
    data=df_plot_loan_status_conteo,
    x='TARGET',
    y='proportion',
    ax=ax,
    hue='TARGET',
    dodge=False,  # Do not offset the bars by hue
    palette="pastel",
    edgecolor="0.2"  # Draw a border around the bars
)
# Title and axis labels
ax.set_title('Proportion of TARGET values', fontsize=18, fontweight='bold', color='darkblue')
ax.set_ylabel('Proportion', fontsize=14, color='darkgrey')
ax.set_xlabel('Value', fontsize=14, color='darkgrey')
# Add the percentage labels above the bars
for container in ax.containers:
    ax.bar_label(container, fmt='{:,.2%}', label_type="edge", padding=3, fontsize=12, color="black")
I computed and plotted the distribution of TARGET to check that, thanks to stratification, the train and test sets keep the same proportions after the split.
from sklearn.model_selection import train_test_split
X_df_loan, X_df_loan_test, y_df_loan, y_df_loan_test = train_test_split(
    df_loan.drop('TARGET', axis=1),
    df_loan['TARGET'],
    stratify=df_loan['TARGET'],
    test_size=0.2,
    random_state=seed  # Reuse the seed defined above so the split is reproducible
)
df_loan_train = pd.concat([X_df_loan, y_df_loan],axis=1)
df_loan_test = pd.concat([X_df_loan_test, y_df_loan_test],axis=1)
print(f'''
\033[1mTRAIN\033[0m:
{y_df_loan.value_counts(normalize=True)}
\033[1mTEST\033[0m:
{y_df_loan_test.value_counts(normalize=True)}''')
TRAIN:
TARGET
0    0.919271
1    0.080729
Name: proportion, dtype: float64
TEST:
TARGET
0    0.919272
1    0.080728
Name: proportion, dtype: float64
The stratified split worked as intended: the proportion of TARGET is the same in train and test.
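To illustrate why stratification keeps the class proportions, here is a minimal sketch on synthetic data. The toy DataFrame, its 8% positive rate, and the fixed `random_state` are assumptions made for the example; they are not the notebook's data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced toy data: exactly 8% of rows have TARGET == 1
n_rows = 10_000
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "feature": rng.normal(size=n_rows),
    "TARGET": np.array([1] * 800 + [0] * 9_200),
})

X_train, X_test, y_train, y_test = train_test_split(
    toy.drop(columns="TARGET"),
    toy["TARGET"],
    stratify=toy["TARGET"],  # keep the TARGET proportion in both subsets
    test_size=0.2,
    random_state=0,          # fixed only so the sketch is reproducible
)

overall_rate = toy["TARGET"].mean()
train_rate = y_train.mean()
test_rate = y_test.mean()
print(f"overall={overall_rate:.4f} train={train_rate:.4f} test={test_rate:.4f}")
```

With `stratify` set, the positive rate in each subset can differ from the overall rate only by rounding, which is what the TRAIN and TEST proportions above show for the real data.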
Descriptive visualization of the data
We will look at the proportion of null values per column and per row, and then visualize how the remaining variables relate to the TARGET variable.
pd_series_null_columns = df_loan_train.isnull().sum().sort_values(ascending=False)
pd_series_null_rows = df_loan_train.isnull().sum(axis=1).sort_values(ascending=False)
print(pd_series_null_columns.shape, pd_series_null_rows.shape)
pd_null_columnas = pd.DataFrame(pd_series_null_columns, columns=['nulos_columnas'])
pd_null_filas = pd.DataFrame(pd_series_null_rows, columns=['nulos_filas'])
pd_null_filas['TARGET'] = df_loan['TARGET'].copy()
pd_null_columnas['porcentaje_columnas'] = pd_null_columnas['nulos_columnas']/df_loan_train.shape[0]
pd_null_filas['porcentaje_filas']= pd_null_filas['nulos_filas']/df_loan_train.shape[1]
(122,) (246008,)
pd_null_columnas
| | nulos_columnas | porcentaje_columnas |
|---|---|---|
| COMMONAREA_AVG | 171905 | 0.698778 |
| COMMONAREA_MEDI | 171905 | 0.698778 |
| COMMONAREA_MODE | 171905 | 0.698778 |
| NONLIVINGAPARTMENTS_MODE | 170809 | 0.694323 |
| NONLIVINGAPARTMENTS_AVG | 170809 | 0.694323 |
| NONLIVINGAPARTMENTS_MEDI | 170809 | 0.694323 |
| FONDKAPREMONT_MODE | 168190 | 0.683677 |
| LIVINGAPARTMENTS_MEDI | 168161 | 0.683559 |
| LIVINGAPARTMENTS_AVG | 168161 | 0.683559 |
| LIVINGAPARTMENTS_MODE | 168161 | 0.683559 |
| FLOORSMIN_AVG | 166886 | 0.678376 |
| FLOORSMIN_MODE | 166886 | 0.678376 |
| FLOORSMIN_MEDI | 166886 | 0.678376 |
| YEARS_BUILD_MODE | 163614 | 0.665076 |
| YEARS_BUILD_AVG | 163614 | 0.665076 |
| YEARS_BUILD_MEDI | 163614 | 0.665076 |
| OWN_CAR_AGE | 162412 | 0.660190 |
| LANDAREA_MEDI | 146058 | 0.593712 |
| LANDAREA_MODE | 146058 | 0.593712 |
| LANDAREA_AVG | 146058 | 0.593712 |
| BASEMENTAREA_AVG | 143933 | 0.585074 |
| BASEMENTAREA_MODE | 143933 | 0.585074 |
| BASEMENTAREA_MEDI | 143933 | 0.585074 |
| EXT_SOURCE_1 | 138614 | 0.563453 |
| NONLIVINGAREA_AVG | 135696 | 0.551592 |
| NONLIVINGAREA_MODE | 135696 | 0.551592 |
| NONLIVINGAREA_MEDI | 135696 | 0.551592 |
| ELEVATORS_MODE | 131147 | 0.533101 |
| ELEVATORS_MEDI | 131147 | 0.533101 |
| ELEVATORS_AVG | 131147 | 0.533101 |
| WALLSMATERIAL_MODE | 125108 | 0.508553 |
| APARTMENTS_MEDI | 124908 | 0.507740 |
| APARTMENTS_AVG | 124908 | 0.507740 |
| APARTMENTS_MODE | 124908 | 0.507740 |
| ENTRANCES_MODE | 123895 | 0.503622 |
| ENTRANCES_MEDI | 123895 | 0.503622 |
| ENTRANCES_AVG | 123895 | 0.503622 |
| LIVINGAREA_AVG | 123444 | 0.501789 |
| LIVINGAREA_MEDI | 123444 | 0.501789 |
| LIVINGAREA_MODE | 123444 | 0.501789 |
| HOUSETYPE_MODE | 123422 | 0.501699 |
| FLOORSMAX_MODE | 122459 | 0.497785 |
| FLOORSMAX_MEDI | 122459 | 0.497785 |
| FLOORSMAX_AVG | 122459 | 0.497785 |
| YEARS_BEGINEXPLUATATION_AVG | 120009 | 0.487826 |
| YEARS_BEGINEXPLUATATION_MODE | 120009 | 0.487826 |
| YEARS_BEGINEXPLUATATION_MEDI | 120009 | 0.487826 |
| TOTALAREA_MODE | 118723 | 0.482598 |
| EMERGENCYSTATE_MODE | 116615 | 0.474029 |
| OCCUPATION_TYPE | 76962 | 0.312843 |
| EXT_SOURCE_3 | 48773 | 0.198258 |
| AMT_REQ_CREDIT_BUREAU_HOUR | 33129 | 0.134666 |
| AMT_REQ_CREDIT_BUREAU_WEEK | 33129 | 0.134666 |
| AMT_REQ_CREDIT_BUREAU_MON | 33129 | 0.134666 |
| AMT_REQ_CREDIT_BUREAU_YEAR | 33129 | 0.134666 |
| AMT_REQ_CREDIT_BUREAU_DAY | 33129 | 0.134666 |
| AMT_REQ_CREDIT_BUREAU_QRT | 33129 | 0.134666 |
| NAME_TYPE_SUITE | 1040 | 0.004228 |
| DEF_30_CNT_SOCIAL_CIRCLE | 821 | 0.003337 |
| OBS_60_CNT_SOCIAL_CIRCLE | 821 | 0.003337 |
| DEF_60_CNT_SOCIAL_CIRCLE | 821 | 0.003337 |
| OBS_30_CNT_SOCIAL_CIRCLE | 821 | 0.003337 |
| EXT_SOURCE_2 | 537 | 0.002183 |
| AMT_GOODS_PRICE | 217 | 0.000882 |
| AMT_ANNUITY | 11 | 0.000045 |
| CNT_FAM_MEMBERS | 1 | 0.000004 |
| DAYS_LAST_PHONE_CHANGE | 1 | 0.000004 |
| SK_ID_CURR | 0 | 0.000000 |
| HOUR_APPR_PROCESS_START | 0 | 0.000000 |
| REG_REGION_NOT_LIVE_REGION | 0 | 0.000000 |
| ORGANIZATION_TYPE | 0 | 0.000000 |
| NAME_CONTRACT_TYPE | 0 | 0.000000 |
| FLAG_OWN_CAR | 0 | 0.000000 |
| CODE_GENDER | 0 | 0.000000 |
| AMT_CREDIT | 0 | 0.000000 |
| AMT_INCOME_TOTAL | 0 | 0.000000 |
| CNT_CHILDREN | 0 | 0.000000 |
| NAME_INCOME_TYPE | 0 | 0.000000 |
| NAME_FAMILY_STATUS | 0 | 0.000000 |
| NAME_HOUSING_TYPE | 0 | 0.000000 |
| REGION_POPULATION_RELATIVE | 0 | 0.000000 |
| NAME_EDUCATION_TYPE | 0 | 0.000000 |
| DAYS_BIRTH | 0 | 0.000000 |
| DAYS_EMPLOYED | 0 | 0.000000 |
| DAYS_REGISTRATION | 0 | 0.000000 |
| DAYS_ID_PUBLISH | 0 | 0.000000 |
| FLAG_MOBIL | 0 | 0.000000 |
| FLAG_EMP_PHONE | 0 | 0.000000 |
| FLAG_WORK_PHONE | 0 | 0.000000 |
| FLAG_CONT_MOBILE | 0 | 0.000000 |
| FLAG_OWN_REALTY | 0 | 0.000000 |
| LIVE_REGION_NOT_WORK_REGION | 0 | 0.000000 |
| FLAG_EMAIL | 0 | 0.000000 |
| REGION_RATING_CLIENT | 0 | 0.000000 |
| REGION_RATING_CLIENT_W_CITY | 0 | 0.000000 |
| WEEKDAY_APPR_PROCESS_START | 0 | 0.000000 |
| FLAG_PHONE | 0 | 0.000000 |
| REG_CITY_NOT_LIVE_CITY | 0 | 0.000000 |
| REG_CITY_NOT_WORK_CITY | 0 | 0.000000 |
| LIVE_CITY_NOT_WORK_CITY | 0 | 0.000000 |
| REG_REGION_NOT_WORK_REGION | 0 | 0.000000 |
| FLAG_DOCUMENT_4 | 0 | 0.000000 |
| FLAG_DOCUMENT_5 | 0 | 0.000000 |
| FLAG_DOCUMENT_2 | 0 | 0.000000 |
| FLAG_DOCUMENT_3 | 0 | 0.000000 |
| FLAG_DOCUMENT_11 | 0 | 0.000000 |
| FLAG_DOCUMENT_10 | 0 | 0.000000 |
| FLAG_DOCUMENT_9 | 0 | 0.000000 |
| FLAG_DOCUMENT_8 | 0 | 0.000000 |
| FLAG_DOCUMENT_7 | 0 | 0.000000 |
| FLAG_DOCUMENT_6 | 0 | 0.000000 |
| FLAG_DOCUMENT_12 | 0 | 0.000000 |
| FLAG_DOCUMENT_13 | 0 | 0.000000 |
| FLAG_DOCUMENT_19 | 0 | 0.000000 |
| FLAG_DOCUMENT_18 | 0 | 0.000000 |
| FLAG_DOCUMENT_17 | 0 | 0.000000 |
| FLAG_DOCUMENT_16 | 0 | 0.000000 |
| FLAG_DOCUMENT_15 | 0 | 0.000000 |
| FLAG_DOCUMENT_14 | 0 | 0.000000 |
| FLAG_DOCUMENT_20 | 0 | 0.000000 |
| FLAG_DOCUMENT_21 | 0 | 0.000000 |
| TARGET | 0 | 0.000000 |
pd_null_filas
| | nulos_filas | TARGET | porcentaje_filas |
|---|---|---|---|
| 269786 | 61 | 0 | 0.5 |
| 69707 | 61 | 0 | 0.5 |
| 244833 | 61 | 0 | 0.5 |
| 197736 | 61 | 0 | 0.5 |
| 150206 | 61 | 0 | 0.5 |
| ... | ... | ... | ... |
| 134994 | 0 | 0 | 0.0 |
| 85268 | 0 | 0 | 0.0 |
| 216116 | 0 | 1 | 0.0 |
| 156655 | 0 | 0 | 0.0 |
| 245999 | 0 | 0 | 0.0 |
246008 rows × 3 columns
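The same null shares can be obtained more directly with `isnull().mean()`, which averages the boolean mask along an axis. The sketch below uses a small toy DataFrame with column names borrowed from the dataset; the 0.6 cutoff is an arbitrary value chosen for illustration, not a threshold used in this notebook.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "OWN_CAR_AGE": [np.nan, np.nan, 26.0, np.nan],
    "EXT_SOURCE_3": [0.14, np.nan, 0.73, np.nan],
    "AMT_CREDIT": [406597.5, 1293502.5, 135000.0, 312682.5],
})

# Share of missing values per column (mean of the boolean mask over rows)
null_share_by_column = toy.isnull().mean().sort_values(ascending=False)

# Share of missing values per row (mean of the boolean mask over columns)
null_share_by_row = toy.isnull().mean(axis=1)

# Columns whose null share exceeds an illustrative cutoff
illustrative_cutoff = 0.6
high_null_columns = null_share_by_column[
    null_share_by_column > illustrative_cutoff
].index.tolist()

print(null_share_by_column)
print(high_null_columns)
```

A summary like this makes it easy to shortlist columns for later treatment without building the count and percentage columns separately.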
Next we visualize the distribution of the numeric and categorical variables against the TARGET variable.
I first build lists of variables by type so they can be plotted below.
df_loan_bool, df_loan_cat, df_loan_num = f_aux.tipos_vars1(df_loan,False)
warnings.filterwarnings('ignore')
for i in list(df_loan_train.columns):
    if i in df_loan_num:
        # Numeric variables: continuous plot against TARGET
        f_aux.double_plot(df_loan_train, col_name=i, is_cont=True, target='TARGET')
    elif ((i in df_loan_bool) or (i in df_loan_cat)) and (i != 'TARGET'):
        # Boolean and categorical variables: discrete plot against TARGET
        f_aux.double_plot(df_loan_train, col_name=i, is_cont=False, target='TARGET')
df_loan_train['ORGANIZATION_TYPE'] = df_loan_train['ORGANIZATION_TYPE'].astype('category')
f_aux.double_plot(df_loan_train, col_name='ORGANIZATION_TYPE', is_cont=False, target='TARGET')
Analysis of the plots
Plotting the variables reveals several points worth noting, such as the imbalance of the target variable mentioned earlier and the number of null values in some variables, which we will handle later. Below are some observations on how individual variables behave with respect to the target variable TARGET.
- Clients with older cars tend to fall behind on their loan payments.
- Payment difficulty appears to increase for clients with lower scores in EXT_SOURCE_1, EXT_SOURCE_2 and EXT_SOURCE_3, each of which is a normalized score from an external data source.
- Clients whose homes have wooden walls are the most likely to fall behind on their loan payments.
- Clients in lower-skilled jobs (low-skill laborers, drivers, waiters) are more likely to fall behind on their loan payments.
- The more credit enquiries made before the loan application (AMT_REQ_CREDIT_BUREAU), the higher the probability of late repayment.
- The larger the client's family, the more likely the client is to fall behind on a loan payment.
- Clients who changed their mobile phone relatively recently (DAYS_LAST_PHONE_CHANGE) are more likely to have difficulty repaying the loan.
- Men are more likely than women to have difficulty repaying the loan (CODE_GENDER).
- The more children the client has, the greater the payment difficulty (CNT_CHILDREN).
- Clients on maternity leave or unemployed are more likely to have difficulty repaying the loan (NAME_INCOME_TYPE).
- Clients with a higher level of education are less likely to have difficulty repaying the loan (NAME_EDUCATION_TYPE).
- Younger clients (DAYS_BIRTH) appear to have more difficulty repaying the loan.
- Clients who changed their ID document (DAYS_ID_PUBLISH) or their registration (DAYS_REGISTRATION) shortly before applying tend to have more difficulty repaying the loan.
- The higher the rating of the region where the client lives (REGION_RATING_CLIENT), the higher the probability of payment difficulties.
- Clients who provided FLAG_DOCUMENT_2 are more likely to have difficulty repaying the loan.
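Observations like the ones above can be checked numerically by computing the share of clients with TARGET equal to 1 inside each category. The following is a minimal sketch on made-up rows; the values are invented for illustration and do not come from the dataset.

```python
import pandas as pd

# Invented toy rows; only the column names come from the dataset
toy = pd.DataFrame({
    "CODE_GENDER": ["M", "M", "F", "F", "F", "M", "F", "F"],
    "TARGET":      [1,   0,   0,   0,   1,   1,   0,   0],
})

# Because TARGET is 0/1, its mean within a group is the share of
# clients with payment difficulties in that group
rate_by_gender = (
    toy.groupby("CODE_GENDER", observed=True)["TARGET"]
       .agg(default_rate="mean", n_clients="size")
       .sort_values("default_rate", ascending=False)
)
print(rate_by_gender)
```

Keeping the group size next to the rate helps avoid reading too much into categories with very few clients.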
f_aux.get_deviation_of_mean_perc(df_loan_train, list_var_continuous, target='TARGET', multiplier=3)
| | 0.0 | 1.0 | variable | sum_outlier_values | porcentaje_sum_null_values |
|---|---|---|---|---|---|
| 0 | 0.954442 | 0.045558 | COMMONAREA_AVG | 1317 | 0.005353 |
| 1 | 0.953558 | 0.046442 | COMMONAREA_MEDI | 1335 | 0.005427 |
| 2 | 0.949962 | 0.050038 | COMMONAREA_MODE | 1319 | 0.005362 |
| 3 | 0.935264 | 0.064736 | NONLIVINGAPARTMENTS_AVG | 587 | 0.002386 |
| 4 | 0.931389 | 0.068611 | NONLIVINGAPARTMENTS_MEDI | 583 | 0.002370 |
| 5 | 0.925182 | 0.074818 | NONLIVINGAPARTMENTS_MODE | 548 | 0.002228 |
| 6 | 0.951567 | 0.048433 | LIVINGAPARTMENTS_MEDI | 1404 | 0.005707 |
| 7 | 0.953305 | 0.046695 | LIVINGAPARTMENTS_AVG | 1392 | 0.005658 |
| 8 | 0.950912 | 0.049088 | LIVINGAPARTMENTS_MODE | 1426 | 0.005797 |
| 9 | 0.963351 | 0.036649 | FLOORSMIN_MODE | 382 | 0.001553 |
| 10 | 0.963441 | 0.036559 | FLOORSMIN_AVG | 465 | 0.001890 |
| 11 | 0.961364 | 0.038636 | FLOORSMIN_MEDI | 440 | 0.001789 |
| 12 | 0.920969 | 0.079031 | YEARS_BUILD_MODE | 949 | 0.003858 |
| 13 | 0.921218 | 0.078782 | YEARS_BUILD_MEDI | 952 | 0.003870 |
| 14 | 0.920298 | 0.079702 | YEARS_BUILD_AVG | 941 | 0.003825 |
| 15 | 0.915503 | 0.084497 | OWN_CAR_AGE | 2722 | 0.011065 |
| 16 | 0.941418 | 0.058582 | LANDAREA_MEDI | 1707 | 0.006939 |
| 17 | 0.938360 | 0.061640 | LANDAREA_AVG | 1671 | 0.006792 |
| 18 | 0.937830 | 0.062170 | LANDAREA_MODE | 1705 | 0.006931 |
| 19 | 0.945765 | 0.054235 | BASEMENTAREA_MODE | 1641 | 0.006671 |
| 20 | 0.946727 | 0.053273 | BASEMENTAREA_AVG | 1558 | 0.006333 |
| 21 | 0.946599 | 0.053401 | BASEMENTAREA_MEDI | 1573 | 0.006394 |
| 22 | 0.944530 | 0.055470 | NONLIVINGAREA_AVG | 1947 | 0.007914 |
| 23 | 0.945399 | 0.054601 | NONLIVINGAREA_MODE | 1978 | 0.008040 |
| 24 | 0.945212 | 0.054788 | NONLIVINGAREA_MEDI | 1953 | 0.007939 |
| 25 | 0.957623 | 0.042377 | ELEVATORS_MEDI | 1935 | 0.007866 |
| 26 | 0.958355 | 0.041645 | ELEVATORS_AVG | 1945 | 0.007906 |
| 27 | 0.951776 | 0.048224 | ELEVATORS_MODE | 2675 | 0.010874 |
| 28 | 0.951858 | 0.048142 | APARTMENTS_AVG | 2368 | 0.009626 |
| 29 | 0.951209 | 0.048791 | APARTMENTS_MODE | 2398 | 0.009748 |
| 30 | 0.951300 | 0.048700 | APARTMENTS_MEDI | 2423 | 0.009849 |
| 31 | 0.937675 | 0.062325 | ENTRANCES_MEDI | 1781 | 0.007240 |
| 32 | 0.941121 | 0.058879 | ENTRANCES_MODE | 2106 | 0.008561 |
| 33 | 0.938453 | 0.061547 | ENTRANCES_AVG | 1771 | 0.007199 |
| 34 | 0.951859 | 0.048141 | LIVINGAREA_AVG | 2555 | 0.010386 |
| 35 | 0.949461 | 0.050539 | LIVINGAREA_MODE | 2691 | 0.010939 |
| 36 | 0.952177 | 0.047823 | LIVINGAREA_MEDI | 2572 | 0.010455 |
| 37 | 0.959376 | 0.040624 | FLOORSMAX_MODE | 2117 | 0.008605 |
| 38 | 0.957955 | 0.042045 | FLOORSMAX_AVG | 2093 | 0.008508 |
| 39 | 0.958106 | 0.041894 | FLOORSMAX_MEDI | 2196 | 0.008927 |
| 40 | 0.908411 | 0.091589 | YEARS_BEGINEXPLUATATION_MODE | 535 | 0.002175 |
| 41 | 0.910584 | 0.089416 | YEARS_BEGINEXPLUATATION_AVG | 548 | 0.002228 |
| 42 | 0.904854 | 0.095146 | YEARS_BEGINEXPLUATATION_MEDI | 515 | 0.002093 |
| 43 | 0.958506 | 0.041494 | TOTALAREA_MODE | 2651 | 0.010776 |
| 44 | 0.925448 | 0.074552 | AMT_REQ_CREDIT_BUREAU_WEEK | 6814 | 0.027698 |
| 45 | 0.946082 | 0.053918 | AMT_REQ_CREDIT_BUREAU_MON | 2578 | 0.010479 |
| 46 | 0.924266 | 0.075734 | AMT_REQ_CREDIT_BUREAU_HOUR | 1294 | 0.005260 |
| 47 | 0.914095 | 0.085905 | AMT_REQ_CREDIT_BUREAU_DAY | 1199 | 0.004874 |
| 48 | 0.907767 | 0.092233 | AMT_REQ_CREDIT_BUREAU_YEAR | 2678 | 0.010886 |
| 49 | 0.913961 | 0.086039 | AMT_REQ_CREDIT_BUREAU_QRT | 1848 | 0.007512 |
| 50 | 0.913829 | 0.086171 | OBS_60_CNT_SOCIAL_CIRCLE | 4816 | 0.019577 |
| 51 | 0.874327 | 0.125673 | DEF_60_CNT_SOCIAL_CIRCLE | 3159 | 0.012841 |
| 52 | 0.914217 | 0.085783 | OBS_30_CNT_SOCIAL_CIRCLE | 4966 | 0.020186 |
| 53 | 0.880065 | 0.119935 | DEF_30_CNT_SOCIAL_CIRCLE | 5503 | 0.022369 |
| 54 | 0.960299 | 0.039701 | AMT_GOODS_PRICE | 3350 | 0.013617 |
| 55 | 0.964621 | 0.035379 | AMT_ANNUITY | 2346 | 0.009536 |
| 56 | 0.901660 | 0.098340 | CNT_FAM_MEMBERS | 3193 | 0.012979 |
| 57 | 0.949119 | 0.050881 | DAYS_LAST_PHONE_CHANGE | 511 | 0.002077 |
| 58 | 0.957310 | 0.042690 | AMT_CREDIT | 2647 | 0.010760 |
| 59 | 0.942857 | 0.057143 | AMT_INCOME_TOTAL | 210 | 0.000854 |
| 60 | 0.958470 | 0.041530 | REGION_POPULATION_RELATIVE | 6718 | 0.027308 |
| 61 | 0.960848 | 0.039152 | DAYS_REGISTRATION | 613 | 0.002492 |
- The variables to highlight are 'AMT_CREDIT', the total amount lent to the client, and 'AMT_INCOME_TOTAL', the client's total income, since these values may carry relative importance for 'TARGET'. Given that roughly 8% of clients in the target variable have payment difficulties, the number of outliers is not a major concern. It should be kept in mind, but a priori it should not affect the final conclusions because it is so small.
In addition, the outlier percentages are very low for practically all variables and should not significantly affect the results, so for now I will keep them.
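The implementation of `f_aux.get_deviation_of_mean_perc` is not shown in this notebook. Judging only from its name, its `multiplier=3` argument, and the output columns, it plausibly flags values that lie more than three standard deviations from the mean. The sketch below works under that assumption; `outlier_summary` is a hypothetical function and may differ from the real helper.

```python
import pandas as pd


def outlier_summary(df, column, target="TARGET", multiplier=3.0):
    """Summarise values of `column` outside mean +/- multiplier * std.

    Hypothetical re-creation; the notebook's own helper may use other rules.
    """
    values = df[column]
    mean, std = values.mean(), values.std()
    lower, upper = mean - multiplier * std, mean + multiplier * std
    # Comparisons with NaN are False, so missing values are never flagged
    is_outlier = (values < lower) | (values > upper)
    n_outliers = int(is_outlier.sum())
    return {
        "variable": column,
        "n_outliers": n_outliers,
        "outlier_share": n_outliers / len(df),
        # Distribution of the target among the flagged rows only
        "target_share": df.loc[is_outlier, target]
                          .value_counts(normalize=True)
                          .to_dict(),
    }


# Toy data: nineteen ordinary incomes and one extreme value
toy = pd.DataFrame({
    "AMT_INCOME_TOTAL": [100.0] * 19 + [10_000.0],
    "TARGET": [0] * 19 + [1],
})
summary = outlier_summary(toy, "AMT_INCOME_TOTAL")
print(summary)
```

Comparing the target distribution among flagged rows with the overall rate of about 8% is one way to judge whether the outliers behave differently from the rest of the clients.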
Correlation analysis between variables
Correlation matrix for numeric variables
corr = pd.concat([df_loan_train.select_dtypes('number').drop(df_loan_bool, axis=1), df_loan_train['TARGET']], axis=1).corr(method='pearson')
corr
| SK_ID_CURR | COMMONAREA_AVG | COMMONAREA_MEDI | COMMONAREA_MODE | NONLIVINGAPARTMENTS_AVG | NONLIVINGAPARTMENTS_MEDI | NONLIVINGAPARTMENTS_MODE | LIVINGAPARTMENTS_MEDI | LIVINGAPARTMENTS_AVG | LIVINGAPARTMENTS_MODE | FLOORSMIN_MODE | FLOORSMIN_AVG | FLOORSMIN_MEDI | YEARS_BUILD_MODE | YEARS_BUILD_MEDI | YEARS_BUILD_AVG | OWN_CAR_AGE | LANDAREA_MEDI | LANDAREA_AVG | LANDAREA_MODE | BASEMENTAREA_MODE | BASEMENTAREA_AVG | BASEMENTAREA_MEDI | EXT_SOURCE_1 | NONLIVINGAREA_AVG | NONLIVINGAREA_MODE | NONLIVINGAREA_MEDI | ELEVATORS_MEDI | ELEVATORS_AVG | ELEVATORS_MODE | APARTMENTS_AVG | APARTMENTS_MODE | APARTMENTS_MEDI | ENTRANCES_MEDI | ENTRANCES_MODE | ENTRANCES_AVG | LIVINGAREA_AVG | LIVINGAREA_MODE | LIVINGAREA_MEDI | FLOORSMAX_MODE | FLOORSMAX_AVG | FLOORSMAX_MEDI | YEARS_BEGINEXPLUATATION_MODE | YEARS_BEGINEXPLUATATION_AVG | YEARS_BEGINEXPLUATATION_MEDI | TOTALAREA_MODE | EXT_SOURCE_3 | AMT_REQ_CREDIT_BUREAU_WEEK | AMT_REQ_CREDIT_BUREAU_MON | AMT_REQ_CREDIT_BUREAU_HOUR | AMT_REQ_CREDIT_BUREAU_DAY | AMT_REQ_CREDIT_BUREAU_YEAR | AMT_REQ_CREDIT_BUREAU_QRT | OBS_60_CNT_SOCIAL_CIRCLE | DEF_60_CNT_SOCIAL_CIRCLE | OBS_30_CNT_SOCIAL_CIRCLE | DEF_30_CNT_SOCIAL_CIRCLE | EXT_SOURCE_2 | AMT_GOODS_PRICE | AMT_ANNUITY | CNT_FAM_MEMBERS | DAYS_LAST_PHONE_CHANGE | HOUR_APPR_PROCESS_START | AMT_CREDIT | AMT_INCOME_TOTAL | CNT_CHILDREN | REGION_POPULATION_RELATIVE | DAYS_BIRTH | DAYS_EMPLOYED | DAYS_REGISTRATION | DAYS_ID_PUBLISH | REGION_RATING_CLIENT | REGION_RATING_CLIENT_W_CITY | TARGET | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SK_ID_CURR | 1.000000 | -0.000618 | -0.000306 | -0.000235 | -0.004044 | -0.004488 | -0.003741 | 0.003474 | 0.003429 | 0.004078 | 0.003772 | 0.004766 | 0.004513 | 0.007333 | 0.007869 | 0.008118 | 0.000983 | 0.003210 | 0.002934 | 0.003035 | -0.000687 | -0.001218 | -0.001100 | -0.000098 | 0.002460 | 0.001556 | 0.001669 | 0.005666 | 0.005552 | 0.005788 | 0.001911 | 0.002158 | 0.002255 | -0.002076 | -0.002179 | -0.002377 | 0.003940 | 0.004250 | 0.004374 | 0.005201 | 0.005760 | 0.005355 | 0.002445 | 0.002513 | 0.002298 | 0.003307 | -0.000007 | 0.001299 | 0.000227 | -0.002844 | -0.001018 | 0.004930 | -0.000050 | -0.001489 | 0.000678 | -0.001404 | -0.000575 | 0.001123 | 0.000227 | -0.000003 | -0.002231 | 0.000776 | 0.000205 | 0.000214 | -0.001795 | -0.000688 | 0.001271 | -0.000841 | 0.001274 | -0.000630 | -0.000887 | -0.001853 | -0.001741 | -0.000581 |
| COMMONAREA_AVG | -0.000618 | 1.000000 | 0.995723 | 0.976990 | 0.104080 | 0.103623 | 0.101982 | 0.532262 | 0.530703 | 0.523296 | 0.287190 | 0.294760 | 0.294020 | 0.226436 | 0.229366 | 0.229621 | -0.038274 | 0.254899 | 0.253077 | 0.240619 | 0.383025 | 0.401366 | 0.400316 | 0.032502 | 0.227503 | 0.215756 | 0.227223 | 0.518537 | 0.520095 | 0.501695 | 0.536826 | 0.511312 | 0.536078 | 0.322824 | 0.299515 | 0.325433 | 0.544066 | 0.519428 | 0.542972 | 0.395279 | 0.401736 | 0.400223 | 0.050956 | 0.095025 | 0.078857 | 0.550656 | -0.005499 | -0.009497 | 0.022451 | 0.006416 | -0.000265 | -0.014661 | -0.010515 | -0.020677 | -0.014209 | -0.021039 | -0.012428 | 0.053179 | 0.049932 | 0.056695 | 0.000262 | -0.002659 | 0.047662 | 0.049198 | 0.086203 | -0.000503 | 0.168101 | 0.006585 | -0.008967 | 0.024592 | -0.000485 | -0.120701 | -0.130876 | -0.021858 |
| COMMONAREA_MEDI | -0.000306 | 0.995723 | 1.000000 | 0.980186 | 0.104108 | 0.104648 | 0.103190 | 0.535049 | 0.531468 | 0.526520 | 0.285829 | 0.293205 | 0.292670 | 0.227598 | 0.230334 | 0.230275 | -0.037959 | 0.257784 | 0.255666 | 0.244055 | 0.386034 | 0.402608 | 0.402752 | 0.031532 | 0.227796 | 0.217994 | 0.229016 | 0.520443 | 0.520169 | 0.503396 | 0.537877 | 0.514492 | 0.539034 | 0.325659 | 0.302568 | 0.327092 | 0.545263 | 0.522608 | 0.545882 | 0.394667 | 0.400655 | 0.399626 | 0.051044 | 0.095260 | 0.079089 | 0.550483 | -0.005625 | -0.009552 | 0.022149 | 0.006569 | -0.000085 | -0.014401 | -0.010050 | -0.020014 | -0.013928 | -0.020368 | -0.012346 | 0.051516 | 0.048917 | 0.055852 | 0.000731 | -0.002478 | 0.046151 | 0.048203 | 0.084201 | -0.000145 | 0.163327 | 0.007296 | -0.009276 | 0.025303 | -0.000236 | -0.117366 | -0.127754 | -0.021818 |
| COMMONAREA_MODE | -0.000235 | 0.976990 | 0.980186 | 1.000000 | 0.101379 | 0.102925 | 0.106578 | 0.524858 | 0.520722 | 0.535876 | 0.276881 | 0.275963 | 0.275386 | 0.224705 | 0.221103 | 0.221461 | -0.032890 | 0.266620 | 0.263642 | 0.262352 | 0.402390 | 0.400002 | 0.402012 | 0.027070 | 0.220478 | 0.227433 | 0.224863 | 0.505053 | 0.503870 | 0.504118 | 0.527508 | 0.524077 | 0.529621 | 0.332168 | 0.321489 | 0.332897 | 0.534595 | 0.533735 | 0.536113 | 0.378377 | 0.376467 | 0.375441 | 0.049195 | 0.090068 | 0.074009 | 0.541181 | -0.004424 | -0.008405 | 0.019809 | 0.006513 | 0.000204 | -0.013372 | -0.009280 | -0.016636 | -0.013215 | -0.016998 | -0.011801 | 0.043665 | 0.041974 | 0.047572 | 0.000838 | -0.000391 | 0.040003 | 0.041446 | 0.072656 | -0.000906 | 0.134159 | 0.007584 | -0.009378 | 0.025497 | -0.000491 | -0.095498 | -0.107276 | -0.019588 |
| NONLIVINGAPARTMENTS_AVG | -0.004044 | 0.104080 | 0.104108 | 0.101379 | 1.000000 | 0.988800 | 0.968168 | 0.155765 | 0.160672 | 0.142623 | 0.069839 | 0.073014 | 0.072611 | 0.071022 | 0.071933 | 0.072432 | -0.027417 | 0.065266 | 0.063079 | 0.059149 | 0.091922 | 0.096333 | 0.095775 | 0.017335 | 0.217959 | 0.208738 | 0.216835 | 0.121778 | 0.121878 | 0.114281 | 0.196310 | 0.181238 | 0.192000 | 0.061096 | 0.052706 | 0.061623 | 0.136229 | 0.127699 | 0.135458 | 0.108526 | 0.113893 | 0.112877 | 0.020760 | 0.035872 | 0.032569 | 0.144837 | 0.009442 | -0.003654 | -0.000560 | 0.000469 | -0.001643 | 0.001379 | 0.002805 | -0.001056 | -0.001319 | -0.001377 | 0.001349 | 0.019233 | 0.014541 | 0.022276 | 0.002755 | 0.001123 | 0.014680 | 0.013413 | 0.030406 | 0.004179 | 0.024268 | 0.000849 | -0.002721 | 0.035364 | -0.008094 | -0.018347 | -0.021329 | -0.003702 |
| NONLIVINGAPARTMENTS_MEDI | -0.004488 | 0.103623 | 0.104648 | 0.102925 | 0.988800 | 1.000000 | 0.979302 | 0.156919 | 0.155997 | 0.144498 | 0.068459 | 0.071046 | 0.071093 | 0.069883 | 0.070533 | 0.070675 | -0.026958 | 0.062342 | 0.061601 | 0.057166 | 0.093564 | 0.096266 | 0.096361 | 0.016615 | 0.218406 | 0.211433 | 0.218793 | 0.121835 | 0.121147 | 0.115012 | 0.194906 | 0.184346 | 0.193856 | 0.062836 | 0.055111 | 0.062890 | 0.136504 | 0.129633 | 0.136379 | 0.107229 | 0.111649 | 0.111432 | 0.020289 | 0.034919 | 0.031826 | 0.144587 | 0.008861 | -0.003997 | -0.000965 | 0.000675 | -0.001680 | 0.001970 | 0.003295 | -0.000561 | -0.000911 | -0.000880 | 0.001888 | 0.018113 | 0.013412 | 0.021405 | 0.003062 | 0.001182 | 0.014174 | 0.012401 | 0.028913 | 0.004442 | 0.021699 | 0.000777 | -0.002782 | 0.034240 | -0.007466 | -0.015891 | -0.019139 | -0.002904 |
| NONLIVINGAPARTMENTS_MODE | -0.003741 | 0.101982 | 0.103190 | 0.106578 | 0.968168 | 0.979302 | 1.000000 | 0.146722 | 0.145581 | 0.146565 | 0.067889 | 0.066070 | 0.066105 | 0.067812 | 0.066010 | 0.066057 | -0.024587 | 0.062815 | 0.061931 | 0.062180 | 0.098268 | 0.095715 | 0.096317 | 0.015245 | 0.212423 | 0.214904 | 0.213502 | 0.116300 | 0.115512 | 0.115426 | 0.189573 | 0.186793 | 0.188568 | 0.065015 | 0.061913 | 0.064713 | 0.131247 | 0.132183 | 0.131377 | 0.101441 | 0.102779 | 0.102801 | 0.019254 | 0.032312 | 0.029473 | 0.139331 | 0.008848 | -0.004205 | -0.001375 | -0.000420 | -0.001305 | 0.002258 | 0.003143 | -0.000231 | -0.000246 | -0.000553 | 0.003069 | 0.016875 | 0.010851 | 0.017211 | 0.002576 | 0.000882 | 0.012107 | 0.010076 | 0.025624 | 0.004294 | 0.016331 | 0.001163 | -0.003421 | 0.032723 | -0.007737 | -0.010272 | -0.014123 | -0.001785 |
| LIVINGAPARTMENTS_MEDI | 0.003474 | 0.532262 | 0.535049 | 0.524858 | 0.155765 | 0.156919 | 0.146722 | 1.000000 | 0.993444 | 0.975784 | 0.433319 | 0.440796 | 0.439792 | 0.332742 | 0.333236 | 0.334181 | -0.049922 | 0.425089 | 0.421164 | 0.415587 | 0.629270 | 0.650066 | 0.651839 | 0.043665 | 0.292987 | 0.276488 | 0.292079 | 0.816285 | 0.814531 | 0.801272 | 0.943828 | 0.916230 | 0.944156 | 0.567007 | 0.537489 | 0.568360 | 0.884652 | 0.858999 | 0.886539 | 0.584335 | 0.590479 | 0.588101 | 0.088925 | 0.153387 | 0.131092 | 0.847531 | 0.000900 | -0.007432 | 0.032529 | 0.002651 | 0.003484 | -0.013095 | -0.008347 | -0.028310 | -0.016995 | -0.028816 | -0.015635 | 0.078604 | 0.061198 | 0.074110 | -0.004163 | -0.002901 | 0.078353 | 0.058731 | 0.105237 | -0.005822 | 0.190426 | 0.013687 | -0.020043 | 0.025284 | 0.000204 | -0.152176 | -0.176999 | -0.025916 |
| LIVINGAPARTMENTS_AVG | 0.003429 | 0.530703 | 0.531468 | 0.520722 | 0.160672 | 0.155997 | 0.145581 | 0.993444 | 1.000000 | 0.970003 | 0.432860 | 0.441105 | 0.439306 | 0.330950 | 0.331908 | 0.333106 | -0.050750 | 0.420759 | 0.417510 | 0.410504 | 0.624144 | 0.647704 | 0.646803 | 0.045369 | 0.292061 | 0.273629 | 0.289557 | 0.811084 | 0.813447 | 0.795752 | 0.945602 | 0.909630 | 0.936836 | 0.561134 | 0.531457 | 0.565461 | 0.881894 | 0.852971 | 0.879724 | 0.584088 | 0.591459 | 0.588124 | 0.088665 | 0.152964 | 0.130738 | 0.849248 | 0.001055 | -0.007485 | 0.032595 | 0.002833 | 0.003390 | -0.012730 | -0.008789 | -0.028427 | -0.017006 | -0.028928 | -0.015667 | 0.080303 | 0.062989 | 0.076515 | -0.004810 | -0.003382 | 0.079959 | 0.060508 | 0.107432 | -0.006488 | 0.195956 | 0.013299 | -0.020296 | 0.024839 | 0.000710 | -0.156766 | -0.181184 | -0.026580 |
| LIVINGAPARTMENTS_MODE | 0.004078 | 0.523296 | 0.526520 | 0.535876 | 0.142623 | 0.144498 | 0.146565 | 0.975784 | 0.970003 | 1.000000 | 0.431111 | 0.428039 | 0.427364 | 0.331959 | 0.324816 | 0.325796 | -0.044679 | 0.436411 | 0.431860 | 0.438350 | 0.653435 | 0.648962 | 0.651692 | 0.038305 | 0.284769 | 0.287368 | 0.286799 | 0.800515 | 0.798802 | 0.809301 | 0.931941 | 0.939327 | 0.933477 | 0.573767 | 0.566713 | 0.575143 | 0.874249 | 0.879962 | 0.875901 | 0.573404 | 0.569560 | 0.567508 | 0.087476 | 0.148304 | 0.126255 | 0.834733 | 0.001906 | -0.007015 | 0.030218 | 0.003853 | 0.003741 | -0.012366 | -0.008217 | -0.025142 | -0.017523 | -0.025629 | -0.016124 | 0.071318 | 0.054533 | 0.065992 | -0.004381 | -0.003171 | 0.072238 | 0.052481 | 0.092782 | -0.006230 | 0.164517 | 0.013336 | -0.019826 | 0.023973 | 0.000049 | -0.129571 | -0.155851 | -0.024955 |
| FLOORSMIN_MODE | 0.003772 | 0.287190 | 0.285829 | 0.276881 | 0.069839 | 0.068459 | 0.067889 | 0.433319 | 0.432860 | 0.431111 | 1.000000 | 0.986275 | 0.988711 | 0.354088 | 0.352507 | 0.352567 | -0.073863 | 0.152698 | 0.150064 | 0.149188 | 0.207001 | 0.220236 | 0.217340 | 0.067297 | 0.147103 | 0.136329 | 0.143140 | 0.500231 | 0.500414 | 0.496078 | 0.437226 | 0.424276 | 0.435556 | 0.034670 | 0.028875 | 0.037452 | 0.458830 | 0.444623 | 0.457790 | 0.727696 | 0.723655 | 0.724492 | 0.100572 | 0.168074 | 0.148876 | 0.446324 | 0.003778 | -0.001291 | 0.035653 | 0.003737 | 0.003338 | -0.008855 | -0.004238 | -0.035979 | -0.022872 | -0.036522 | -0.025390 | 0.106986 | 0.076515 | 0.094729 | -0.001186 | -0.006971 | 0.113720 | 0.074611 | 0.130492 | -0.009376 | 0.273877 | 0.000420 | -0.013644 | 0.019499 | -0.009859 | -0.215123 | -0.222929 | -0.033119 |
| FLOORSMIN_AVG | 0.004766 | 0.294760 | 0.293205 | 0.275963 | 0.073014 | 0.071046 | 0.066070 | 0.440796 | 0.441105 | 0.428039 | 0.986275 | 1.000000 | 0.997300 | 0.352802 | 0.358981 | 0.359817 | -0.076332 | 0.150093 | 0.147504 | 0.139141 | 0.199227 | 0.222760 | 0.219260 | 0.070879 | 0.153013 | 0.131155 | 0.146666 | 0.510074 | 0.511838 | 0.496145 | 0.445280 | 0.419621 | 0.442561 | 0.031725 | 0.016065 | 0.034497 | 0.467477 | 0.440933 | 0.465169 | 0.730044 | 0.743030 | 0.740669 | 0.101034 | 0.172300 | 0.152133 | 0.456486 | 0.002409 | -0.001575 | 0.039477 | 0.003833 | 0.003686 | -0.010269 | -0.004978 | -0.038168 | -0.023657 | -0.038671 | -0.026169 | 0.112450 | 0.080338 | 0.100250 | -0.002877 | -0.007270 | 0.119442 | 0.078129 | 0.139013 | -0.010143 | 0.292362 | 0.001133 | -0.014006 | 0.020757 | -0.009386 | -0.229994 | -0.236985 | -0.033705 |
| FLOORSMIN_MEDI | 0.004513 | 0.294020 | 0.292670 | 0.275386 | 0.072611 | 0.071093 | 0.066105 | 0.439792 | 0.439306 | 0.427364 | 0.988711 | 0.997300 | 1.000000 | 0.353089 | 0.359400 | 0.359322 | -0.076610 | 0.150355 | 0.147874 | 0.139709 | 0.198881 | 0.221369 | 0.218122 | 0.069767 | 0.152203 | 0.131145 | 0.146481 | 0.509386 | 0.509984 | 0.495428 | 0.443479 | 0.418987 | 0.441581 | 0.030663 | 0.015810 | 0.033887 | 0.465766 | 0.440049 | 0.464294 | 0.730901 | 0.740699 | 0.741322 | 0.100881 | 0.171914 | 0.152238 | 0.454403 | 0.002280 | -0.000978 | 0.038721 | 0.003881 | 0.003681 | -0.010540 | -0.004967 | -0.037967 | -0.023556 | -0.038457 | -0.026158 | 0.111551 | 0.079628 | 0.098972 | -0.002178 | -0.007243 | 0.118550 | 0.077513 | 0.137605 | -0.009670 | 0.288614 | 0.001302 | -0.014512 | 0.020821 | -0.009253 | -0.227258 | -0.234200 | -0.033636 |
| YEARS_BUILD_MODE | 0.007333 | 0.226436 | 0.227598 | 0.224705 | 0.071022 | 0.069883 | 0.067812 | 0.332742 | 0.330950 | 0.331959 | 0.354088 | 0.352802 | 0.353089 | 1.000000 | 0.989634 | 0.989766 | -0.043684 | 0.183206 | 0.181211 | 0.177669 | 0.243382 | 0.248353 | 0.247039 | 0.013757 | 0.125128 | 0.117844 | 0.124120 | 0.339740 | 0.338728 | 0.336509 | 0.337593 | 0.328968 | 0.337040 | 0.091311 | 0.085704 | 0.092882 | 0.352223 | 0.344393 | 0.352294 | 0.510358 | 0.508150 | 0.508380 | 0.302129 | 0.492266 | 0.438762 | 0.355397 | 0.014674 | -0.006569 | -0.004297 | 0.001198 | 0.001962 | -0.020694 | -0.006423 | 0.001401 | -0.011099 | 0.001537 | -0.010162 | 0.007695 | 0.038318 | 0.030641 | 0.041360 | 0.011749 | -0.016409 | 0.033075 | 0.038279 | 0.029196 | -0.064028 | 0.025823 | -0.006851 | 0.163429 | -0.009393 | 0.048298 | 0.040781 | -0.025586 |
| YEARS_BUILD_MEDI | 0.007869 | 0.229366 | 0.230334 | 0.221103 | 0.071933 | 0.070533 | 0.066010 | 0.333236 | 0.331908 | 0.324816 | 0.352507 | 0.358981 | 0.359400 | 0.989634 | 1.000000 | 0.998634 | -0.044703 | 0.179507 | 0.178186 | 0.168604 | 0.232960 | 0.246902 | 0.244974 | 0.014411 | 0.127607 | 0.112985 | 0.124902 | 0.342223 | 0.341850 | 0.333069 | 0.338268 | 0.321492 | 0.337094 | 0.085301 | 0.072267 | 0.087948 | 0.353487 | 0.337070 | 0.352896 | 0.511288 | 0.517014 | 0.517300 | 0.299885 | 0.497321 | 0.443892 | 0.357755 | 0.015024 | -0.006244 | -0.004164 | 0.001142 | 0.003460 | -0.021299 | -0.007438 | 0.000646 | -0.011636 | 0.000839 | -0.010555 | 0.010393 | 0.039981 | 0.032850 | 0.041839 | 0.011615 | -0.014470 | 0.034655 | 0.042482 | 0.029595 | -0.058163 | 0.027171 | -0.007974 | 0.164861 | -0.009253 | 0.043189 | 0.036414 | -0.025933 |
| YEARS_BUILD_AVG | 0.008118 | 0.229621 | 0.230275 | 0.221461 | 0.072432 | 0.070675 | 0.066057 | 0.334181 | 0.333106 | 0.325796 | 0.352567 | 0.359817 | 0.359322 | 0.989766 | 0.998634 | 1.000000 | -0.044935 | 0.179786 | 0.178408 | 0.168751 | 0.233523 | 0.247753 | 0.245664 | 0.014988 | 0.127629 | 0.112896 | 0.124702 | 0.342611 | 0.342693 | 0.333591 | 0.339399 | 0.322318 | 0.337949 | 0.086125 | 0.072993 | 0.088597 | 0.354490 | 0.337875 | 0.353607 | 0.511258 | 0.518305 | 0.517313 | 0.299906 | 0.497986 | 0.443345 | 0.359051 | 0.015181 | -0.006283 | -0.004172 | 0.001230 | 0.003057 | -0.021440 | -0.007304 | 0.000507 | -0.011478 | 0.000709 | -0.010424 | 0.010791 | 0.040326 | 0.033351 | 0.041869 | 0.011920 | -0.014282 | 0.034931 | 0.042782 | 0.029646 | -0.057069 | 0.026899 | -0.007603 | 0.165196 | -0.009454 | 0.042167 | 0.035435 | -0.025685 |
| OWN_CAR_AGE | 0.000983 | -0.038274 | -0.037959 | -0.032890 | -0.027417 | -0.026958 | -0.024587 | -0.049922 | -0.050750 | -0.044679 | -0.073863 | -0.076332 | -0.076610 | -0.043684 | -0.044703 | -0.044935 | 1.000000 | -0.021395 | -0.021384 | -0.019544 | -0.026777 | -0.032436 | -0.031287 | -0.081396 | -0.032484 | -0.028765 | -0.032077 | -0.065794 | -0.066436 | -0.061457 | -0.051160 | -0.045620 | -0.049992 | -0.016462 | -0.012329 | -0.017163 | -0.059801 | -0.054950 | -0.058606 | -0.080548 | -0.082869 | -0.082545 | 0.001837 | -0.000012 | 0.000043 | -0.061077 | -0.013837 | 0.003276 | -0.022521 | 0.003907 | -0.006480 | -0.015641 | -0.017527 | 0.005161 | 0.011677 | 0.005222 | 0.007421 | -0.081239 | -0.106258 | -0.099371 | -0.015176 | 0.002689 | -0.069504 | -0.096874 | -0.119654 | 0.009539 | -0.082891 | 0.007699 | 0.028075 | -0.025165 | 0.008747 | 0.086297 | 0.087654 | 0.039531 |
| LANDAREA_MEDI | 0.003210 | 0.254899 | 0.257784 | 0.266620 | 0.065266 | 0.062342 | 0.062815 | 0.425089 | 0.420759 | 0.436411 | 0.152698 | 0.150093 | 0.150355 | 0.183206 | 0.179507 | 0.179786 | -0.021395 | 1.000000 | 0.990884 | 0.981228 | 0.475542 | 0.471256 | 0.472674 | 0.005099 | 0.161373 | 0.162151 | 0.164334 | 0.378455 | 0.376991 | 0.380360 | 0.498729 | 0.500957 | 0.500756 | 0.511590 | 0.502198 | 0.512221 | 0.503788 | 0.505662 | 0.504540 | 0.220507 | 0.217760 | 0.217653 | 0.054186 | 0.076599 | 0.071351 | 0.493214 | 0.009260 | 0.005231 | 0.011826 | -0.001021 | 0.005569 | -0.011681 | 0.006480 | -0.003551 | -0.001748 | -0.003813 | -0.002895 | 0.021615 | 0.011375 | 0.005896 | 0.000430 | -0.000237 | 0.014274 | 0.004690 | -0.002390 | -0.004147 | -0.053101 | 0.004539 | -0.011408 | 0.003442 | -0.005515 | 0.046965 | 0.037945 | -0.013984 |
| LANDAREA_AVG | 0.002934 | 0.253077 | 0.255666 | 0.263642 | 0.063079 | 0.061601 | 0.061931 | 0.421164 | 0.417510 | 0.431860 | 0.150064 | 0.147504 | 0.147874 | 0.181211 | 0.178186 | 0.178408 | -0.021384 | 0.990884 | 1.000000 | 0.972972 | 0.470694 | 0.468224 | 0.469202 | 0.005084 | 0.160731 | 0.160244 | 0.162543 | 0.375876 | 0.375158 | 0.377513 | 0.495934 | 0.496410 | 0.496969 | 0.507562 | 0.497818 | 0.508848 | 0.501159 | 0.501364 | 0.501161 | 0.219714 | 0.216961 | 0.216819 | 0.053952 | 0.076331 | 0.071097 | 0.491015 | 0.009236 | 0.007634 | 0.012075 | -0.001104 | 0.005682 | -0.012393 | 0.006054 | -0.003694 | -0.001492 | -0.003964 | -0.002509 | 0.022506 | 0.011802 | 0.006374 | 0.000102 | 0.000591 | 0.014503 | 0.005175 | -0.002143 | -0.004457 | -0.051987 | 0.004210 | -0.011420 | 0.003438 | -0.005355 | 0.045123 | 0.036342 | -0.013539 |
| LANDAREA_MODE | 0.003035 | 0.240619 | 0.244055 | 0.262352 | 0.059149 | 0.057166 | 0.062180 | 0.415587 | 0.410504 | 0.438350 | 0.149188 | 0.139141 | 0.139709 | 0.177669 | 0.168604 | 0.168751 | -0.019544 | 0.981228 | 0.972972 | 1.000000 | 0.484460 | 0.464144 | 0.466461 | 0.003333 | 0.154759 | 0.168565 | 0.159405 | 0.364891 | 0.362521 | 0.380411 | 0.487670 | 0.508529 | 0.490293 | 0.511949 | 0.518244 | 0.511956 | 0.491731 | 0.513547 | 0.493367 | 0.212257 | 0.202091 | 0.202247 | 0.052933 | 0.072452 | 0.067231 | 0.479343 | 0.008100 | 0.005646 | 0.010784 | -0.000234 | 0.005862 | -0.010501 | 0.006728 | -0.002552 | -0.002895 | -0.002832 | -0.003729 | 0.017290 | 0.007835 | 0.001457 | 0.001572 | -0.000183 | 0.011613 | 0.001402 | -0.004020 | -0.003953 | -0.061096 | 0.004763 | -0.010425 | 0.004006 | -0.005961 | 0.058796 | 0.048524 | -0.012519 |
| BASEMENTAREA_MODE | -0.000687 | 0.383025 | 0.386034 | 0.402390 | 0.091922 | 0.093564 | 0.098268 | 0.629270 | 0.624144 | 0.653435 | 0.207001 | 0.199227 | 0.198881 | 0.243382 | 0.232960 | 0.233523 | -0.026777 | 0.475542 | 0.470694 | 0.484460 | 1.000000 | 0.975291 | 0.978262 | 0.033933 | 0.254772 | 0.270229 | 0.259540 | 0.539354 | 0.538022 | 0.552293 | 0.660366 | 0.678423 | 0.662998 | 0.651956 | 0.653784 | 0.652995 | 0.673436 | 0.690212 | 0.672895 | 0.308825 | 0.298168 | 0.297787 | 0.059862 | 0.083918 | 0.076458 | 0.648240 | 0.004110 | -0.002767 | 0.019158 | -0.000325 | 0.004118 | -0.011166 | -0.002863 | -0.010674 | -0.011675 | -0.011010 | -0.009459 | 0.037158 | 0.037724 | 0.036378 | -0.004981 | -0.005732 | 0.034527 | 0.033595 | 0.011618 | -0.009291 | 0.066314 | -0.002691 | -0.000176 | -0.018812 | -0.011839 | -0.032146 | -0.046738 | -0.021323 |
| BASEMENTAREA_AVG | -0.001218 | 0.401366 | 0.402608 | 0.400002 | 0.096333 | 0.096266 | 0.095715 | 0.650066 | 0.647704 | 0.648962 | 0.220236 | 0.222760 | 0.221369 | 0.248353 | 0.246902 | 0.247753 | -0.032436 | 0.471256 | 0.468224 | 0.464144 | 0.975291 | 1.000000 | 0.995783 | 0.039123 | 0.263714 | 0.258374 | 0.262911 | 0.561617 | 0.563952 | 0.554458 | 0.679918 | 0.667089 | 0.678909 | 0.647729 | 0.627240 | 0.651806 | 0.693521 | 0.678494 | 0.690215 | 0.328630 | 0.329492 | 0.327642 | 0.061477 | 0.089229 | 0.081782 | 0.673316 | 0.005423 | -0.002262 | 0.020907 | -0.001259 | 0.004760 | -0.012728 | -0.003567 | -0.015154 | -0.013251 | -0.015466 | -0.010879 | 0.047843 | 0.045509 | 0.046552 | -0.005527 | -0.006458 | 0.041399 | 0.041226 | 0.015454 | -0.009050 | 0.098987 | -0.002384 | -0.001224 | -0.020079 | -0.012849 | -0.061396 | -0.074168 | -0.023834 |
| BASEMENTAREA_MEDI | -0.001100 | 0.400316 | 0.402752 | 0.402012 | 0.095775 | 0.096361 | 0.096317 | 0.651839 | 0.646803 | 0.651692 | 0.217340 | 0.219260 | 0.218122 | 0.247039 | 0.244974 | 0.245664 | -0.031287 | 0.472674 | 0.469202 | 0.466461 | 0.978262 | 0.995783 | 1.000000 | 0.038267 | 0.262984 | 0.260387 | 0.264363 | 0.561737 | 0.561484 | 0.554872 | 0.678683 | 0.668909 | 0.680415 | 0.651333 | 0.631156 | 0.652724 | 0.692532 | 0.680624 | 0.691541 | 0.325050 | 0.325273 | 0.323735 | 0.060960 | 0.088566 | 0.081095 | 0.669533 | 0.005378 | -0.002666 | 0.021017 | -0.001086 | 0.005041 | -0.012201 | -0.003754 | -0.014443 | -0.012850 | -0.014743 | -0.010474 | 0.046458 | 0.043617 | 0.044472 | -0.005794 | -0.007090 | 0.040947 | 0.039395 | 0.014711 | -0.009238 | 0.094199 | -0.002358 | -0.001120 | -0.020656 | -0.013046 | -0.057318 | -0.070167 | -0.023122 |
| EXT_SOURCE_1 | -0.000098 | 0.032502 | 0.031532 | 0.027070 | 0.017335 | 0.016615 | 0.015245 | 0.043665 | 0.045369 | 0.038305 | 0.067297 | 0.070879 | 0.069767 | 0.013757 | 0.014411 | 0.014988 | -0.081396 | 0.005099 | 0.005084 | 0.003333 | 0.033933 | 0.039123 | 0.038267 | 1.000000 | 0.030153 | 0.024627 | 0.028812 | 0.070254 | 0.071731 | 0.066530 | 0.051169 | 0.045355 | 0.049499 | 0.019566 | 0.016316 | 0.020374 | 0.065437 | 0.060149 | 0.064241 | 0.086689 | 0.089897 | 0.088767 | -0.002320 | -0.001223 | -0.000907 | 0.063785 | 0.185211 | -0.002503 | 0.031976 | -0.006640 | -0.004104 | 0.005301 | -0.002403 | -0.026333 | -0.030973 | -0.026887 | -0.028715 | 0.213917 | 0.174615 | 0.119410 | -0.096102 | -0.130211 | 0.032487 | 0.167599 | 0.023251 | -0.138459 | 0.098941 | -0.598890 | 0.289068 | -0.178719 | -0.132527 | -0.113677 | -0.113373 | -0.155781 |
| NONLIVINGAREA_AVG | 0.002460 | 0.227503 | 0.227796 | 0.220478 | 0.217959 | 0.218406 | 0.212423 | 0.292987 | 0.292061 | 0.284769 | 0.147103 | 0.153013 | 0.152203 | 0.125128 | 0.127607 | 0.127629 | -0.032484 | 0.161373 | 0.160731 | 0.154759 | 0.254772 | 0.263714 | 0.262984 | 0.030153 | 1.000000 | 0.966617 | 0.990679 | 0.279937 | 0.282617 | 0.274017 | 0.298349 | 0.285919 | 0.295956 | 0.161830 | 0.155001 | 0.164597 | 0.300217 | 0.285037 | 0.296622 | 0.248043 | 0.253370 | 0.252478 | -0.008654 | 0.012008 | 0.013086 | 0.365713 | -0.002831 | -0.007740 | 0.012384 | 0.002492 | 0.001485 | -0.009466 | -0.002690 | -0.017470 | -0.013272 | -0.017583 | -0.013243 | 0.045519 | 0.044956 | 0.054684 | 0.004982 | -0.004054 | 0.044565 | 0.040894 | 0.077089 | 0.003166 | 0.076143 | 0.004914 | -0.014019 | 0.052079 | 0.001327 | -0.082002 | -0.082463 | -0.012034 |
| NONLIVINGAREA_MODE | 0.001556 | 0.215756 | 0.217994 | 0.227433 | 0.208738 | 0.211433 | 0.214904 | 0.276488 | 0.273629 | 0.287368 | 0.136329 | 0.131155 | 0.131145 | 0.117844 | 0.112985 | 0.112896 | -0.028765 | 0.162151 | 0.160244 | 0.168565 | 0.270229 | 0.258374 | 0.260387 | 0.024627 | 0.966617 | 1.000000 | 0.976036 | 0.264544 | 0.263841 | 0.273111 | 0.282916 | 0.292644 | 0.283986 | 0.169134 | 0.175061 | 0.169416 | 0.283470 | 0.293653 | 0.282900 | 0.231310 | 0.225307 | 0.225592 | -0.004004 | 0.010092 | 0.008034 | 0.345283 | -0.003298 | -0.007128 | 0.009529 | 0.002035 | 0.000482 | -0.007350 | -0.001136 | -0.013123 | -0.012010 | -0.013260 | -0.011674 | 0.037709 | 0.039309 | 0.045902 | 0.005223 | -0.004427 | 0.038635 | 0.035297 | 0.064064 | 0.002664 | 0.051838 | 0.004090 | -0.012948 | 0.049933 | 0.000026 | -0.059503 | -0.060918 | -0.010751 |
| NONLIVINGAREA_MEDI | 0.001669 | 0.227223 | 0.229016 | 0.224863 | 0.216835 | 0.218793 | 0.213502 | 0.292079 | 0.289557 | 0.286799 | 0.143140 | 0.146666 | 0.146481 | 0.124120 | 0.124902 | 0.124702 | -0.032077 | 0.164334 | 0.162543 | 0.159405 | 0.259540 | 0.262911 | 0.264363 | 0.028812 | 0.990679 | 0.976036 | 1.000000 | 0.279554 | 0.279643 | 0.274803 | 0.296336 | 0.288648 | 0.296514 | 0.164696 | 0.159249 | 0.165812 | 0.298327 | 0.288660 | 0.296795 | 0.243800 | 0.247064 | 0.247169 | -0.009434 | 0.010814 | 0.012117 | 0.360934 | -0.003549 | -0.007969 | 0.011464 | 0.001938 | 0.000551 | -0.008823 | -0.002270 | -0.015676 | -0.012892 | -0.015779 | -0.012713 | 0.043267 | 0.042839 | 0.051977 | 0.005025 | -0.004454 | 0.043368 | 0.038741 | 0.073302 | 0.003049 | 0.067917 | 0.005578 | -0.014081 | 0.052898 | 0.001627 | -0.075075 | -0.075988 | -0.011442 |
| ELEVATORS_MEDI | 0.005666 | 0.518537 | 0.520443 | 0.505053 | 0.121778 | 0.121835 | 0.116300 | 0.816285 | 0.811084 | 0.800515 | 0.500231 | 0.510074 | 0.509386 | 0.339740 | 0.342223 | 0.342611 | -0.065794 | 0.378455 | 0.375876 | 0.364891 | 0.539354 | 0.561617 | 0.561737 | 0.070254 | 0.279937 | 0.264544 | 0.279554 | 1.000000 | 0.995951 | 0.982569 | 0.834284 | 0.807492 | 0.836612 | 0.403318 | 0.378022 | 0.403711 | 0.865624 | 0.840316 | 0.868147 | 0.669392 | 0.676676 | 0.676016 | 0.073841 | 0.079690 | 0.078694 | 0.838507 | 0.006905 | -0.003133 | 0.040722 | 0.000570 | 0.002988 | -0.016773 | -0.004685 | -0.034902 | -0.023556 | -0.035381 | -0.022742 | 0.113715 | 0.083950 | 0.102732 | 0.000133 | -0.011752 | 0.105367 | 0.081052 | 0.039690 | -0.005835 | 0.275174 | -0.000223 | -0.008678 | 0.000790 | -0.010731 | -0.221633 | -0.233193 | -0.035791 |
| ELEVATORS_AVG | 0.005552 | 0.520095 | 0.520169 | 0.503870 | 0.121878 | 0.121147 | 0.115512 | 0.814531 | 0.813447 | 0.798802 | 0.500414 | 0.511838 | 0.509984 | 0.338728 | 0.341850 | 0.342693 | -0.066436 | 0.376991 | 0.375158 | 0.362521 | 0.538022 | 0.563952 | 0.561484 | 0.071731 | 0.282617 | 0.263841 | 0.279643 | 0.995951 | 1.000000 | 0.978454 | 0.836059 | 0.804616 | 0.833627 | 0.400296 | 0.374157 | 0.403905 | 0.867534 | 0.838036 | 0.865341 | 0.671129 | 0.680446 | 0.678042 | 0.073437 | 0.079682 | 0.078568 | 0.845008 | 0.006803 | -0.003030 | 0.040755 | 0.000927 | 0.003282 | -0.017063 | -0.005053 | -0.035805 | -0.024005 | -0.036295 | -0.023109 | 0.115388 | 0.085197 | 0.104433 | -0.000277 | -0.011418 | 0.106407 | 0.082385 | 0.040491 | -0.006032 | 0.281380 | -0.000371 | -0.008651 | -0.000080 | -0.010767 | -0.227037 | -0.238425 | -0.036381 |
| ELEVATORS_MODE | 0.005788 | 0.501695 | 0.503396 | 0.504118 | 0.114281 | 0.115012 | 0.115426 | 0.801272 | 0.795752 | 0.809301 | 0.496078 | 0.496145 | 0.495428 | 0.336509 | 0.333069 | 0.333591 | -0.061457 | 0.380360 | 0.377513 | 0.380411 | 0.552293 | 0.554458 | 0.554872 | 0.066530 | 0.274017 | 0.273111 | 0.274803 | 0.982569 | 0.978454 | 1.000000 | 0.821441 | 0.825110 | 0.824538 | 0.402745 | 0.401050 | 0.402711 | 0.851998 | 0.855616 | 0.855201 | 0.661175 | 0.656435 | 0.655700 | 0.077213 | 0.079978 | 0.079048 | 0.820160 | 0.006742 | -0.002594 | 0.038436 | 0.000721 | 0.003161 | -0.015946 | -0.004350 | -0.031987 | -0.023458 | -0.032493 | -0.022422 | 0.106503 | 0.079855 | 0.096243 | 0.001385 | -0.010413 | 0.099335 | 0.076927 | 0.036894 | -0.005723 | 0.252585 | -0.000107 | -0.008199 | 0.001957 | -0.010496 | -0.201538 | -0.213864 | -0.034306 |
| APARTMENTS_AVG | 0.001911 | 0.536826 | 0.537877 | 0.527508 | 0.196310 | 0.194906 | 0.189573 | 0.943828 | 0.945602 | 0.931941 | 0.437226 | 0.445280 | 0.443479 | 0.337593 | 0.338268 | 0.339399 | -0.051160 | 0.498729 | 0.495934 | 0.487670 | 0.660366 | 0.679918 | 0.678683 | 0.051169 | 0.298349 | 0.282916 | 0.296336 | 0.834284 | 0.836059 | 0.821441 | 1.000000 | 0.972824 | 0.995015 | 0.606101 | 0.581537 | 0.609724 | 0.914218 | 0.893655 | 0.913113 | 0.613988 | 0.618444 | 0.616228 | 0.096675 | 0.101424 | 0.100973 | 0.892090 | 0.003258 | -0.003478 | 0.034102 | 0.001789 | 0.004611 | -0.015733 | -0.002850 | -0.024016 | -0.016403 | -0.024522 | -0.013851 | 0.090343 | 0.067394 | 0.079158 | -0.010062 | -0.008792 | 0.083651 | 0.063280 | 0.031310 | -0.012330 | 0.206390 | 0.006776 | -0.017006 | 0.013472 | -0.006499 | -0.152610 | -0.172048 | -0.031644 |
| APARTMENTS_MODE | 0.002158 | 0.511312 | 0.514492 | 0.524077 | 0.181238 | 0.184346 | 0.186793 | 0.916230 | 0.909630 | 0.939327 | 0.424276 | 0.419621 | 0.418987 | 0.328968 | 0.321492 | 0.322318 | -0.045620 | 0.500957 | 0.496410 | 0.508529 | 0.678423 | 0.667089 | 0.668909 | 0.045355 | 0.285919 | 0.292644 | 0.288648 | 0.807492 | 0.804616 | 0.825110 | 0.972824 | 1.000000 | 0.976870 | 0.610962 | 0.614574 | 0.610929 | 0.890888 | 0.911286 | 0.894377 | 0.595375 | 0.585385 | 0.584368 | 0.101931 | 0.101663 | 0.100944 | 0.862033 | 0.002695 | -0.003112 | 0.031627 | 0.002413 | 0.004402 | -0.013849 | -0.002504 | -0.020334 | -0.016230 | -0.020814 | -0.013413 | 0.079769 | 0.059877 | 0.068941 | -0.008183 | -0.007959 | 0.074785 | 0.055799 | 0.027137 | -0.011344 | 0.175433 | 0.006669 | -0.015507 | 0.013142 | -0.006220 | -0.123397 | -0.143996 | -0.029427 |
| APARTMENTS_MEDI | 0.002255 | 0.536078 | 0.539034 | 0.529621 | 0.192000 | 0.193856 | 0.188568 | 0.944156 | 0.936836 | 0.933477 | 0.435556 | 0.442561 | 0.441581 | 0.337040 | 0.337094 | 0.337949 | -0.049992 | 0.500756 | 0.496969 | 0.490293 | 0.662998 | 0.678909 | 0.680415 | 0.049499 | 0.295956 | 0.283986 | 0.296514 | 0.836612 | 0.833627 | 0.824538 | 0.995015 | 0.976870 | 1.000000 | 0.610141 | 0.586601 | 0.610304 | 0.913205 | 0.896485 | 0.916740 | 0.612051 | 0.614861 | 0.613871 | 0.096835 | 0.101344 | 0.101165 | 0.886104 | 0.002998 | -0.003533 | 0.033987 | 0.001865 | 0.004587 | -0.015310 | -0.002789 | -0.023605 | -0.016286 | -0.024106 | -0.013698 | 0.088616 | 0.065768 | 0.077002 | -0.009991 | -0.009078 | 0.082497 | 0.061599 | 0.030678 | -0.012264 | 0.201838 | 0.006985 | -0.016718 | 0.013577 | -0.006452 | -0.148410 | -0.167831 | -0.031137 |
| ENTRANCES_MEDI | -0.002076 | 0.322824 | 0.325659 | 0.332168 | 0.061096 | 0.062836 | 0.065015 | 0.567007 | 0.561134 | 0.573767 | 0.034670 | 0.031725 | 0.030663 | 0.091311 | 0.085301 | 0.086125 | -0.016462 | 0.511590 | 0.507562 | 0.511949 | 0.651956 | 0.647729 | 0.651333 | 0.019566 | 0.161830 | 0.169134 | 0.164696 | 0.403318 | 0.400296 | 0.402745 | 0.606101 | 0.610962 | 0.610141 | 1.000000 | 0.980457 | 0.996902 | 0.615481 | 0.622494 | 0.619575 | 0.086672 | 0.083234 | 0.081517 | 0.037591 | 0.041857 | 0.040780 | 0.587397 | 0.008871 | -0.000142 | 0.013349 | -0.002721 | 0.006788 | -0.010360 | -0.000025 | 0.000122 | -0.004201 | -0.000143 | -0.000596 | 0.031061 | 0.017277 | 0.012701 | -0.003046 | -0.012220 | 0.021172 | 0.013505 | 0.004576 | -0.006975 | 0.033167 | -0.008534 | 0.002773 | -0.062268 | -0.013221 | -0.021531 | -0.028446 | -0.020116 |
| ENTRANCES_MODE | -0.002179 | 0.299515 | 0.302568 | 0.321489 | 0.052706 | 0.055111 | 0.061913 | 0.537489 | 0.531457 | 0.566713 | 0.028875 | 0.016065 | 0.015810 | 0.085704 | 0.072267 | 0.072993 | -0.012329 | 0.502198 | 0.497818 | 0.518244 | 0.653784 | 0.627240 | 0.631156 | 0.016316 | 0.155001 | 0.175061 | 0.159249 | 0.378022 | 0.374157 | 0.401050 | 0.581537 | 0.614574 | 0.586601 | 0.980457 | 1.000000 | 0.977574 | 0.590724 | 0.623561 | 0.595389 | 0.076702 | 0.061508 | 0.060785 | 0.036312 | 0.038011 | 0.036758 | 0.559452 | 0.008033 | -0.000065 | 0.011055 | -0.002112 | 0.005709 | -0.008534 | 0.000241 | 0.002181 | -0.004933 | 0.001942 | -0.001354 | 0.023618 | 0.013022 | 0.006746 | -0.001079 | -0.011140 | 0.016610 | 0.009157 | 0.002139 | -0.005575 | 0.015755 | -0.008220 | 0.003498 | -0.059319 | -0.012944 | -0.004438 | -0.012132 | -0.018407 |
| ENTRANCES_AVG | -0.002377 | 0.325433 | 0.327092 | 0.332897 | 0.061623 | 0.062890 | 0.064713 | 0.568360 | 0.565461 | 0.575143 | 0.037452 | 0.034497 | 0.033887 | 0.092882 | 0.087948 | 0.088597 | -0.017163 | 0.512221 | 0.508848 | 0.511956 | 0.652995 | 0.651806 | 0.652724 | 0.020374 | 0.164597 | 0.169416 | 0.165812 | 0.403711 | 0.403905 | 0.402711 | 0.609724 | 0.610929 | 0.610304 | 0.996902 | 0.977574 | 1.000000 | 0.619383 | 0.623247 | 0.620071 | 0.091075 | 0.087422 | 0.086365 | 0.038050 | 0.042632 | 0.041513 | 0.594085 | 0.009025 | 0.000257 | 0.013035 | -0.002917 | 0.006717 | -0.010426 | -0.000196 | -0.000307 | -0.004412 | -0.000566 | -0.000895 | 0.032358 | 0.018333 | 0.014063 | -0.002855 | -0.012022 | 0.021492 | 0.014622 | 0.005134 | -0.006867 | 0.036256 | -0.008986 | 0.002734 | -0.062525 | -0.013075 | -0.023626 | -0.030790 | -0.020484 |
| LIVINGAREA_AVG | 0.003940 | 0.544066 | 0.545263 | 0.534595 | 0.136229 | 0.136504 | 0.131247 | 0.884652 | 0.881894 | 0.874249 | 0.458830 | 0.467477 | 0.465766 | 0.352223 | 0.353487 | 0.354490 | -0.059801 | 0.503788 | 0.501159 | 0.491731 | 0.673436 | 0.693521 | 0.692532 | 0.065437 | 0.300217 | 0.283470 | 0.298327 | 0.865624 | 0.867534 | 0.851998 | 0.914218 | 0.890888 | 0.913205 | 0.615481 | 0.590724 | 0.619383 | 1.000000 | 0.971389 | 0.995427 | 0.625755 | 0.630360 | 0.628319 | 0.078552 | 0.095967 | 0.092702 | 0.926029 | 0.003755 | -0.004576 | 0.034536 | 0.001753 | 0.005269 | -0.018721 | -0.002850 | -0.026816 | -0.017731 | -0.027248 | -0.015859 | 0.096877 | 0.078335 | 0.091897 | -0.003996 | -0.011096 | 0.084724 | 0.073658 | 0.035924 | -0.009387 | 0.214648 | 0.001366 | -0.012905 | 0.007223 | -0.010633 | -0.164884 | -0.183204 | -0.035242 |
| LIVINGAREA_MODE | 0.004250 | 0.519428 | 0.522608 | 0.533735 | 0.127699 | 0.129633 | 0.132183 | 0.858999 | 0.852971 | 0.879962 | 0.444623 | 0.440933 | 0.440049 | 0.344393 | 0.337070 | 0.337875 | -0.054950 | 0.505662 | 0.501364 | 0.513547 | 0.690212 | 0.678494 | 0.680624 | 0.060149 | 0.285037 | 0.293653 | 0.288660 | 0.840316 | 0.838036 | 0.855616 | 0.893655 | 0.911286 | 0.896485 | 0.622494 | 0.623561 | 0.623247 | 0.971389 | 1.000000 | 0.974366 | 0.605886 | 0.596739 | 0.595832 | 0.077018 | 0.092394 | 0.088537 | 0.899386 | 0.003765 | -0.003888 | 0.031676 | 0.002418 | 0.005215 | -0.017109 | -0.001990 | -0.021698 | -0.016989 | -0.022150 | -0.014828 | 0.085227 | 0.070392 | 0.081540 | -0.002208 | -0.009979 | 0.075350 | 0.065696 | 0.031260 | -0.008615 | 0.182047 | 0.001559 | -0.011724 | 0.008007 | -0.010999 | -0.133665 | -0.153244 | -0.032972 |
| LIVINGAREA_MEDI | 0.004374 | 0.542972 | 0.545882 | 0.536113 | 0.135458 | 0.136379 | 0.131377 | 0.886539 | 0.879724 | 0.875901 | 0.457790 | 0.465169 | 0.464294 | 0.352294 | 0.352896 | 0.353607 | -0.058606 | 0.504540 | 0.501161 | 0.493367 | 0.672895 | 0.690215 | 0.691541 | 0.064241 | 0.296622 | 0.282900 | 0.296795 | 0.868147 | 0.865341 | 0.855201 | 0.913113 | 0.894377 | 0.916740 | 0.619575 | 0.595389 | 0.620071 | 0.995427 | 0.974366 | 1.000000 | 0.623908 | 0.626875 | 0.626008 | 0.078318 | 0.095415 | 0.092450 | 0.920828 | 0.003507 | -0.004357 | 0.034514 | 0.001925 | 0.005444 | -0.018801 | -0.002907 | -0.026049 | -0.017193 | -0.026479 | -0.015416 | 0.095325 | 0.077267 | 0.090548 | -0.003830 | -0.011248 | 0.083559 | 0.072571 | 0.035275 | -0.009594 | 0.210470 | 0.001903 | -0.013176 | 0.007687 | -0.010498 | -0.161042 | -0.179341 | -0.034857 |
| FLOORSMAX_MODE | 0.005201 | 0.395279 | 0.394667 | 0.378377 | 0.108526 | 0.107229 | 0.101441 | 0.584335 | 0.584088 | 0.573404 | 0.727696 | 0.730044 | 0.730901 | 0.510358 | 0.511288 | 0.511258 | -0.080548 | 0.220507 | 0.219714 | 0.212257 | 0.308825 | 0.328630 | 0.325050 | 0.086689 | 0.248043 | 0.231310 | 0.243800 | 0.669392 | 0.671129 | 0.661175 | 0.613988 | 0.595375 | 0.612051 | 0.086672 | 0.076702 | 0.091075 | 0.625755 | 0.605886 | 0.623908 | 1.000000 | 0.985669 | 0.988201 | 0.109787 | 0.130294 | 0.128123 | 0.626085 | 0.003030 | -0.003126 | 0.041317 | 0.001317 | 0.001660 | -0.018166 | 0.000556 | -0.039061 | -0.030220 | -0.039367 | -0.030381 | 0.129425 | 0.105551 | 0.128378 | -0.001571 | -0.006165 | 0.113995 | 0.100857 | 0.052066 | -0.009677 | 0.303690 | 0.001685 | -0.014106 | 0.049158 | -0.011584 | -0.219861 | -0.237230 | -0.045368 |
| FLOORSMAX_AVG | 0.005760 | 0.401736 | 0.400655 | 0.376467 | 0.113893 | 0.111649 | 0.102779 | 0.590479 | 0.591459 | 0.569560 | 0.723655 | 0.743030 | 0.740699 | 0.508150 | 0.517014 | 0.518305 | -0.082869 | 0.217760 | 0.216961 | 0.202091 | 0.298168 | 0.329492 | 0.325273 | 0.089897 | 0.253370 | 0.225307 | 0.247064 | 0.676676 | 0.680446 | 0.656435 | 0.618444 | 0.585385 | 0.614861 | 0.083234 | 0.061508 | 0.087422 | 0.630360 | 0.596739 | 0.626875 | 0.985669 | 1.000000 | 0.997059 | 0.107363 | 0.131014 | 0.129041 | 0.633646 | 0.002101 | -0.003560 | 0.043776 | 0.001105 | 0.002235 | -0.018978 | -0.000114 | -0.040739 | -0.030484 | -0.041030 | -0.030619 | 0.135144 | 0.108699 | 0.132397 | -0.002280 | -0.006622 | 0.119406 | 0.103899 | 0.054379 | -0.009643 | 0.322096 | 0.002227 | -0.014993 | 0.049425 | -0.011297 | -0.235021 | -0.251429 | -0.046041 |
| FLOORSMAX_MEDI | 0.005355 | 0.400223 | 0.399626 | 0.375441 | 0.112877 | 0.111432 | 0.102801 | 0.588101 | 0.588124 | 0.567508 | 0.724492 | 0.740669 | 0.741322 | 0.508380 | 0.517300 | 0.517313 | -0.082545 | 0.217653 | 0.216819 | 0.202247 | 0.297787 | 0.327642 | 0.323735 | 0.088767 | 0.252478 | 0.225592 | 0.247169 | 0.676016 | 0.678042 | 0.655700 | 0.616228 | 0.584368 | 0.613871 | 0.081517 | 0.060785 | 0.086365 | 0.628319 | 0.595832 | 0.626008 | 0.988201 | 0.997059 | 1.000000 | 0.107193 | 0.130742 | 0.129143 | 0.630983 | 0.002460 | -0.003372 | 0.043082 | 0.001254 | 0.002082 | -0.019034 | 0.000021 | -0.040378 | -0.030446 | -0.040664 | -0.030693 | 0.133912 | 0.108049 | 0.131363 | -0.001868 | -0.006550 | 0.118014 | 0.103290 | 0.053956 | -0.009383 | 0.317838 | 0.002280 | -0.015050 | 0.049661 | -0.011444 | -0.231637 | -0.248063 | -0.045861 |
| YEARS_BEGINEXPLUATATION_MODE | 0.002445 | 0.050956 | 0.051044 | 0.049195 | 0.020760 | 0.020289 | 0.019254 | 0.088925 | 0.088665 | 0.087476 | 0.100572 | 0.101034 | 0.100881 | 0.302129 | 0.299885 | 0.299906 | 0.001837 | 0.054186 | 0.053952 | 0.052933 | 0.059862 | 0.061477 | 0.060960 | -0.002320 | -0.008654 | -0.004004 | -0.009434 | 0.073841 | 0.073437 | 0.077213 | 0.096675 | 0.101931 | 0.096835 | 0.037591 | 0.036312 | 0.038050 | 0.078552 | 0.077018 | 0.078318 | 0.109787 | 0.107363 | 0.107193 | 1.000000 | 0.972994 | 0.966071 | 0.099119 | -0.002873 | 0.002492 | -0.000614 | 0.003927 | -0.000412 | -0.007690 | 0.001574 | -0.000131 | -0.003751 | -0.000038 | -0.003759 | 0.007867 | 0.006882 | 0.015115 | 0.007266 | 0.001918 | -0.011315 | 0.005819 | 0.005204 | 0.006001 | -0.006707 | 0.001740 | 0.008376 | 0.010382 | -0.001100 | 0.004547 | -0.000838 | -0.009553 |
| YEARS_BEGINEXPLUATATION_AVG | 0.002513 | 0.095025 | 0.095260 | 0.090068 | 0.035872 | 0.034919 | 0.032312 | 0.153387 | 0.152964 | 0.148304 | 0.168074 | 0.172300 | 0.171914 | 0.492266 | 0.497321 | 0.497986 | -0.000012 | 0.076599 | 0.076331 | 0.072452 | 0.083918 | 0.089229 | 0.088566 | -0.001223 | 0.012008 | 0.010092 | 0.010814 | 0.079690 | 0.079682 | 0.079978 | 0.101424 | 0.101663 | 0.101344 | 0.041857 | 0.038011 | 0.042632 | 0.095967 | 0.092394 | 0.095415 | 0.130294 | 0.131014 | 0.130742 | 0.972994 | 1.000000 | 0.994221 | 0.101522 | -0.003130 | 0.003277 | -0.001142 | 0.003716 | 0.000386 | -0.008031 | 0.001664 | -0.000455 | -0.005138 | -0.000371 | -0.005337 | 0.008709 | 0.008124 | 0.015545 | 0.007922 | 0.003131 | -0.010619 | 0.007028 | 0.005564 | 0.006926 | -0.006570 | 0.002015 | 0.008846 | 0.012817 | -0.002182 | 0.004508 | -0.000733 | -0.010557 |
| YEARS_BEGINEXPLUATATION_MEDI | 0.002298 | 0.078857 | 0.079089 | 0.074009 | 0.032569 | 0.031826 | 0.029473 | 0.131092 | 0.130738 | 0.126255 | 0.148876 | 0.152133 | 0.152238 | 0.438762 | 0.443892 | 0.443345 | 0.000043 | 0.071351 | 0.071097 | 0.067231 | 0.076458 | 0.081782 | 0.081095 | -0.000907 | 0.013086 | 0.008034 | 0.012117 | 0.078694 | 0.078568 | 0.079048 | 0.100973 | 0.100944 | 0.101165 | 0.040780 | 0.036758 | 0.041513 | 0.092702 | 0.088537 | 0.092450 | 0.128123 | 0.129041 | 0.129143 | 0.966071 | 0.994221 | 1.000000 | 0.100343 | -0.002702 | 0.002431 | -0.000934 | 0.003707 | 0.000400 | -0.007979 | 0.001780 | -0.000366 | -0.005232 | -0.000286 | -0.005390 | 0.008639 | 0.007677 | 0.015242 | 0.007665 | 0.003533 | -0.010170 | 0.006500 | 0.005571 | 0.006571 | -0.006542 | 0.002051 | 0.008620 | 0.012831 | -0.001754 | 0.004512 | -0.000644 | -0.010934 |
| TOTALAREA_MODE | 0.003307 | 0.550656 | 0.550483 | 0.541181 | 0.144837 | 0.144587 | 0.139331 | 0.847531 | 0.849248 | 0.834733 | 0.446324 | 0.456486 | 0.454403 | 0.355397 | 0.357755 | 0.359051 | -0.061077 | 0.493214 | 0.491015 | 0.479343 | 0.648240 | 0.673316 | 0.669533 | 0.063785 | 0.365713 | 0.345283 | 0.360934 | 0.838507 | 0.845008 | 0.820160 | 0.892090 | 0.862033 | 0.886104 | 0.587397 | 0.559452 | 0.594085 | 0.926029 | 0.899386 | 0.920828 | 0.626085 | 0.633646 | 0.630983 | 0.099119 | 0.101522 | 0.100343 | 1.000000 | 0.004567 | -0.003647 | 0.033840 | 0.002358 | 0.005557 | -0.018790 | -0.003923 | -0.027016 | -0.018859 | -0.027462 | -0.017370 | 0.094737 | 0.078645 | 0.092692 | -0.001398 | -0.008522 | 0.080780 | 0.074399 | 0.037922 | -0.006763 | 0.203455 | 0.002688 | -0.014987 | 0.019829 | -0.010000 | -0.161519 | -0.178946 | -0.035540 |
| EXT_SOURCE_3 | -0.000007 | -0.005499 | -0.005625 | -0.004424 | 0.009442 | 0.008861 | 0.008848 | 0.000900 | 0.001055 | 0.001906 | 0.003778 | 0.002409 | 0.002280 | 0.014674 | 0.015024 | 0.015181 | -0.013837 | 0.009260 | 0.009236 | 0.008100 | 0.004110 | 0.005423 | 0.005378 | 0.185211 | -0.002831 | -0.003298 | -0.003549 | 0.006905 | 0.006803 | 0.006742 | 0.003258 | 0.002695 | 0.002998 | 0.008871 | 0.008033 | 0.009025 | 0.003755 | 0.003765 | 0.003507 | 0.003030 | 0.002101 | 0.002460 | -0.002873 | -0.003130 | -0.002702 | 0.004567 | 1.000000 | -0.020485 | -0.008664 | -0.001117 | -0.008654 | -0.072853 | -0.023523 | -0.000080 | -0.034924 | 0.000248 | -0.038208 | 0.109183 | 0.047128 | 0.029045 | -0.029311 | -0.075542 | -0.040533 | 0.043049 | -0.029240 | -0.043570 | -0.006362 | -0.206463 | 0.114225 | -0.106684 | -0.131930 | -0.012732 | -0.012105 | -0.180865 |
| AMT_REQ_CREDIT_BUREAU_WEEK | 0.001299 | -0.009497 | -0.009552 | -0.008405 | -0.003654 | -0.003997 | -0.004205 | -0.007432 | -0.007485 | -0.007015 | -0.001291 | -0.001575 | -0.000978 | -0.006569 | -0.006244 | -0.006283 | 0.003276 | 0.005231 | 0.007634 | 0.005646 | -0.002767 | -0.002262 | -0.002666 | -0.002503 | -0.007740 | -0.007128 | -0.007969 | -0.003133 | -0.003030 | -0.002594 | -0.003478 | -0.003112 | -0.003533 | -0.000142 | -0.000065 | 0.000257 | -0.004576 | -0.003888 | -0.004357 | -0.003126 | -0.003560 | -0.003372 | 0.002492 | 0.003277 | 0.002431 | -0.003647 | -0.020485 | 1.000000 | -0.014782 | 0.004792 | 0.221089 | 0.016939 | -0.014195 | -0.001789 | -0.003369 | -0.001919 | -0.003194 | 0.001740 | -0.001594 | 0.013018 | -0.002436 | -0.002318 | -0.004517 | -0.001802 | 0.001770 | -0.003201 | -0.003104 | -0.000823 | 0.002864 | -0.001097 | -0.002042 | 0.003056 | 0.002039 | -0.001428 |
| AMT_REQ_CREDIT_BUREAU_MON | 0.000227 | 0.022451 | 0.022149 | 0.019809 | -0.000560 | -0.000965 | -0.001375 | 0.032529 | 0.032595 | 0.030218 | 0.035653 | 0.039477 | 0.038721 | -0.004297 | -0.004164 | -0.004172 | -0.022521 | 0.011826 | 0.012075 | 0.010784 | 0.019158 | 0.020907 | 0.021017 | 0.031976 | 0.012384 | 0.009529 | 0.011464 | 0.040722 | 0.040755 | 0.038436 | 0.034102 | 0.031627 | 0.033987 | 0.013349 | 0.011055 | 0.013035 | 0.034536 | 0.031676 | 0.034514 | 0.041317 | 0.043776 | 0.043082 | -0.000614 | -0.001142 | -0.000934 | 0.033840 | -0.008664 | -0.014782 | 1.000000 | -0.000423 | -0.006517 | -0.005589 | -0.008322 | 0.000739 | -0.003774 | 0.000688 | -0.000706 | 0.052036 | 0.056476 | 0.038745 | -0.007124 | -0.041114 | 0.036501 | 0.054457 | 0.022868 | -0.009941 | 0.078099 | 0.003435 | -0.035039 | -0.010973 | -0.008832 | -0.069076 | -0.067108 | -0.012376 |
| AMT_REQ_CREDIT_BUREAU_HOUR | -0.002844 | 0.006416 | 0.006569 | 0.006513 | 0.000469 | 0.000675 | -0.000420 | 0.002651 | 0.002833 | 0.003853 | 0.003737 | 0.003833 | 0.003881 | 0.001198 | 0.001142 | 0.001230 | 0.003907 | -0.001021 | -0.001104 | -0.000234 | -0.000325 | -0.001259 | -0.001086 | -0.006640 | 0.002492 | 0.002035 | 0.001938 | 0.000570 | 0.000927 | 0.000721 | 0.001789 | 0.002413 | 0.001865 | -0.002721 | -0.002112 | -0.002917 | 0.001753 | 0.002418 | 0.001925 | 0.001317 | 0.001105 | 0.001254 | 0.003927 | 0.003716 | 0.003707 | 0.002358 | -0.001117 | 0.004792 | -0.000423 | 1.000000 | 0.219818 | -0.004533 | -0.003131 | -0.000042 | -0.004294 | 0.000002 | -0.002580 | -0.003003 | -0.003191 | 0.003610 | 0.000645 | -0.000615 | -0.017674 | -0.003724 | 0.000290 | -0.000417 | -0.003025 | 0.003899 | -0.003969 | -0.001868 | 0.004427 | 0.006634 | 0.006760 | -0.000547 |
| AMT_REQ_CREDIT_BUREAU_DAY | -0.001018 | -0.000265 | -0.000085 | 0.000204 | -0.001643 | -0.001680 | -0.001305 | 0.003484 | 0.003390 | 0.003741 | 0.003338 | 0.003686 | 0.003681 | 0.001962 | 0.003460 | 0.003057 | -0.006480 | 0.005569 | 0.005682 | 0.005862 | 0.004118 | 0.004760 | 0.005041 | -0.004104 | 0.001485 | 0.000482 | 0.000551 | 0.002988 | 0.003282 | 0.003161 | 0.004611 | 0.004402 | 0.004587 | 0.006788 | 0.005709 | 0.006717 | 0.005269 | 0.005215 | 0.005444 | 0.001660 | 0.002235 | 0.002082 | -0.000412 | 0.000386 | 0.000400 | 0.005557 | -0.008654 | 0.221089 | -0.006517 | 0.219818 | 1.000000 | -0.003451 | -0.004329 | -0.002258 | -0.002209 | -0.002236 | -0.001373 | -0.000246 | 0.004451 | 0.001429 | -0.000485 | 0.002352 | 0.000075 | 0.004057 | 0.002500 | 0.000581 | 0.001361 | 0.002007 | 0.001232 | -0.000931 | -0.002177 | -0.001510 | -0.001322 | 0.000813 |
| AMT_REQ_CREDIT_BUREAU_YEAR | 0.004930 | -0.014661 | -0.014401 | -0.013372 | 0.001379 | 0.001970 | 0.002258 | -0.013095 | -0.012730 | -0.012366 | -0.008855 | -0.010269 | -0.010540 | -0.020694 | -0.021299 | -0.021440 | -0.015641 | -0.011681 | -0.012393 | -0.010501 | -0.011166 | -0.012728 | -0.012201 | 0.005301 | -0.009466 | -0.007350 | -0.008823 | -0.016773 | -0.017063 | -0.015946 | -0.015733 | -0.013849 | -0.015310 | -0.010360 | -0.008534 | -0.010426 | -0.018721 | -0.017109 | -0.018801 | -0.018166 | -0.018978 | -0.019034 | -0.007690 | -0.008031 | -0.007979 | -0.018790 | -0.072853 | 0.016939 | -0.005589 | -0.004533 | -0.003451 | 1.000000 | 0.073030 | 0.034751 | 0.016694 | 0.034265 | 0.019272 | -0.022484 | -0.051730 | -0.011349 | -0.028808 | -0.113448 | -0.030689 | -0.049236 | 0.010620 | -0.041786 | 0.002898 | -0.072728 | 0.049800 | -0.025366 | -0.034662 | 0.010981 | 0.010322 | 0.018896 |
| AMT_REQ_CREDIT_BUREAU_QRT | -0.000050 | -0.010515 | -0.010050 | -0.009280 | 0.002805 | 0.003295 | 0.003143 | -0.008347 | -0.008789 | -0.008217 | -0.004238 | -0.004978 | -0.004967 | -0.006423 | -0.007438 | -0.007304 | -0.017527 | 0.006480 | 0.006054 | 0.006728 | -0.002863 | -0.003567 | -0.003754 | -0.002403 | -0.002690 | -0.001136 | -0.002270 | -0.004685 | -0.005053 | -0.004350 | -0.002850 | -0.002504 | -0.002789 | -0.000025 | 0.000241 | -0.000196 | -0.002850 | -0.001990 | -0.002907 | 0.000556 | -0.000114 | 0.000021 | 0.001574 | 0.001664 | 0.001780 | -0.003923 | -0.023523 | -0.014195 | -0.008322 | -0.003131 | -0.004329 | 0.073030 | 1.000000 | 0.004368 | -0.000078 | 0.004627 | -0.000950 | -0.003633 | 0.015635 | 0.009594 | -0.005218 | -0.002055 | -0.000416 | 0.015057 | 0.004531 | -0.008286 | -0.000677 | -0.011702 | 0.014332 | -0.000095 | -0.007338 | 0.005321 | 0.004850 | -0.002230 |
| OBS_60_CNT_SOCIAL_CIRCLE | -0.001489 | -0.020677 | -0.020014 | -0.016636 | -0.001056 | -0.000561 | -0.000231 | -0.028310 | -0.028427 | -0.025142 | -0.035979 | -0.038168 | -0.037967 | 0.001401 | 0.000646 | 0.000507 | 0.005161 | -0.003551 | -0.003694 | -0.002552 | -0.010674 | -0.015154 | -0.014443 | -0.026333 | -0.017470 | -0.013123 | -0.015676 | -0.034902 | -0.035805 | -0.031987 | -0.024016 | -0.020334 | -0.023605 | 0.000122 | 0.002181 | -0.000307 | -0.026816 | -0.021698 | -0.026049 | -0.039061 | -0.040739 | -0.040378 | -0.000131 | -0.000455 | -0.000366 | -0.027016 | -0.000080 | -0.001789 | 0.000739 | -0.000042 | -0.002258 | 0.034751 | 0.004368 | 1.000000 | 0.234584 | 0.998362 | 0.308842 | -0.019123 | 0.001816 | -0.010986 | 0.025977 | -0.015177 | -0.010677 | 0.001722 | -0.012351 | 0.015323 | -0.010509 | 0.006292 | 0.006044 | 0.009425 | -0.012644 | 0.034230 | 0.029777 | 0.009144 |
| DEF_60_CNT_SOCIAL_CIRCLE | 0.000678 | -0.014209 | -0.013928 | -0.013215 | -0.001319 | -0.000911 | -0.000246 | -0.016995 | -0.017006 | -0.017523 | -0.022872 | -0.023657 | -0.023556 | -0.011099 | -0.011636 | -0.011478 | 0.011677 | -0.001748 | -0.001492 | -0.002895 | -0.011675 | -0.013251 | -0.012850 | -0.030973 | -0.013272 | -0.012010 | -0.012892 | -0.023556 | -0.024005 | -0.023458 | -0.016403 | -0.016230 | -0.016286 | -0.004201 | -0.004933 | -0.004412 | -0.017731 | -0.016989 | -0.017193 | -0.030220 | -0.030484 | -0.030446 | -0.003751 | -0.005138 | -0.005232 | -0.018859 | -0.034924 | -0.003369 | -0.003774 | -0.004294 | -0.002209 | 0.016694 | -0.000078 | 0.234584 | 1.000000 | 0.232368 | 0.859132 | -0.033888 | -0.023002 | -0.023382 | -0.005347 | 0.002201 | -0.009769 | -0.022172 | -0.012178 | -0.003045 | 0.001552 | 0.001259 | 0.014949 | 0.004320 | 0.004500 | 0.017643 | 0.016739 | 0.029870 |
| OBS_30_CNT_SOCIAL_CIRCLE | -0.001404 | -0.021039 | -0.020368 | -0.016998 | -0.001377 | -0.000880 | -0.000553 | -0.028816 | -0.028928 | -0.025629 | -0.036522 | -0.038671 | -0.038457 | 0.001537 | 0.000839 | 0.000709 | 0.005222 | -0.003813 | -0.003964 | -0.002832 | -0.011010 | -0.015466 | -0.014743 | -0.026887 | -0.017583 | -0.013260 | -0.015779 | -0.035381 | -0.036295 | -0.032493 | -0.024522 | -0.020814 | -0.024106 | -0.000143 | 0.001942 | -0.000566 | -0.027248 | -0.022150 | -0.026479 | -0.039367 | -0.041030 | -0.040664 | -0.000038 | -0.000371 | -0.000286 | -0.027462 | 0.000248 | -0.001919 | 0.000688 | 0.000002 | -0.002236 | 0.034265 | 0.004627 | 0.998362 | 0.232368 | 1.000000 | 0.306435 | -0.019501 | 0.001799 | -0.011256 | 0.026342 | -0.014661 | -0.010689 | 0.001677 | -0.012438 | 0.015670 | -0.010980 | 0.006664 | 0.005798 | 0.009426 | -0.012238 | 0.034598 | 0.030115 | 0.009272 |
| DEF_30_CNT_SOCIAL_CIRCLE | -0.000575 | -0.012428 | -0.012346 | -0.011801 | 0.001349 | 0.001888 | 0.003069 | -0.015635 | -0.015667 | -0.016124 | -0.025390 | -0.026169 | -0.026158 | -0.010162 | -0.010555 | -0.010424 | 0.007421 | -0.002895 | -0.002509 | -0.003729 | -0.009459 | -0.010879 | -0.010474 | -0.028715 | -0.013243 | -0.011674 | -0.012713 | -0.022742 | -0.023109 | -0.022422 | -0.013851 | -0.013413 | -0.013698 | -0.000596 | -0.001354 | -0.000895 | -0.015859 | -0.014828 | -0.015416 | -0.030381 | -0.030619 | -0.030693 | -0.003759 | -0.005337 | -0.005390 | -0.017370 | -0.038208 | -0.003194 | -0.000706 | -0.002580 | -0.001373 | 0.019272 | -0.000950 | 0.308842 | 0.859132 | 0.306435 | 1.000000 | -0.032222 | -0.020983 | -0.022416 | -0.002822 | 0.000701 | -0.006368 | -0.019980 | -0.012462 | -0.001948 | 0.006005 | -0.000538 | 0.017882 | 0.002464 | 0.002850 | 0.015480 | 0.014089 | 0.031837 |
| EXT_SOURCE_2 | 0.001123 | 0.053179 | 0.051516 | 0.043665 | 0.019233 | 0.018113 | 0.016875 | 0.078604 | 0.080303 | 0.071318 | 0.106986 | 0.112450 | 0.111551 | 0.007695 | 0.010393 | 0.010791 | -0.081239 | 0.021615 | 0.022506 | 0.017290 | 0.037158 | 0.047843 | 0.046458 | 0.213917 | 0.045519 | 0.037709 | 0.043267 | 0.113715 | 0.115388 | 0.106503 | 0.090343 | 0.079769 | 0.088616 | 0.031061 | 0.023618 | 0.032358 | 0.096877 | 0.085227 | 0.095325 | 0.129425 | 0.135144 | 0.133912 | 0.007867 | 0.008709 | 0.008639 | 0.094737 | 0.109183 | 0.001740 | 0.052036 | -0.003003 | -0.000246 | -0.022484 | -0.003633 | -0.019123 | -0.033888 | -0.019501 | -0.032222 | 1.000000 | 0.139108 | 0.125559 | -0.001857 | -0.195827 | 0.156600 | 0.131146 | 0.054966 | -0.017545 | 0.198794 | -0.091607 | -0.019670 | -0.058838 | -0.050631 | -0.291729 | -0.287190 | -0.159698 |
| AMT_GOODS_PRICE | 0.000227 | 0.049932 | 0.048917 | 0.041974 | 0.014541 | 0.013412 | 0.010851 | 0.061198 | 0.062989 | 0.054533 | 0.076515 | 0.080338 | 0.079628 | 0.038318 | 0.039981 | 0.040326 | -0.106258 | 0.011375 | 0.011802 | 0.007835 | 0.037724 | 0.045509 | 0.043617 | 0.174615 | 0.044956 | 0.039309 | 0.042839 | 0.083950 | 0.085197 | 0.079855 | 0.067394 | 0.059877 | 0.065768 | 0.017277 | 0.013022 | 0.018333 | 0.078335 | 0.070392 | 0.077267 | 0.105551 | 0.108699 | 0.108049 | 0.006882 | 0.008124 | 0.007677 | 0.078645 | 0.047128 | -0.001594 | 0.056476 | -0.003191 | 0.004451 | -0.051730 | 0.015635 | 0.001816 | -0.023002 | 0.001799 | -0.020983 | 0.139108 | 1.000000 | 0.774414 | 0.060464 | -0.076893 | 0.062811 | 0.986998 | 0.146114 | -0.002337 | 0.105018 | -0.053663 | -0.064092 | 0.012095 | -0.008840 | -0.104647 | -0.113207 | -0.039304 |
| AMT_ANNUITY | -0.000003 | 0.056695 | 0.055852 | 0.047572 | 0.022276 | 0.021405 | 0.017211 | 0.074110 | 0.076515 | 0.065992 | 0.094729 | 0.100250 | 0.098972 | 0.030641 | 0.032850 | 0.033351 | -0.099371 | 0.005896 | 0.006374 | 0.001457 | 0.036378 | 0.046552 | 0.044472 | 0.119410 | 0.054684 | 0.045902 | 0.051977 | 0.102732 | 0.104433 | 0.096243 | 0.079158 | 0.068941 | 0.077002 | 0.012701 | 0.006746 | 0.014063 | 0.091897 | 0.081540 | 0.090548 | 0.128378 | 0.132397 | 0.131363 | 0.015115 | 0.015545 | 0.015242 | 0.092692 | 0.029045 | 0.013018 | 0.038745 | 0.003610 | 0.001429 | -0.011349 | 0.009594 | -0.010986 | -0.023382 | -0.011256 | -0.022416 | 0.125559 | 0.774414 | 1.000000 | 0.075081 | -0.064906 | 0.053074 | 0.769449 | 0.175849 | 0.020850 | 0.119916 | 0.008731 | -0.103850 | 0.038813 | 0.011894 | -0.129451 | -0.143008 | -0.012715 |
| CNT_FAM_MEMBERS | -0.002231 | 0.000262 | 0.000731 | 0.000838 | 0.002755 | 0.003062 | 0.002576 | -0.004163 | -0.004810 | -0.004381 | -0.001186 | -0.002877 | -0.002178 | 0.041360 | 0.041839 | 0.041869 | -0.015176 | 0.000430 | 0.000102 | 0.001572 | -0.004981 | -0.005527 | -0.005794 | -0.096102 | 0.004982 | 0.005223 | 0.005025 | 0.000133 | -0.000277 | 0.001385 | -0.010062 | -0.008183 | -0.009991 | -0.003046 | -0.001079 | -0.002855 | -0.003996 | -0.002208 | -0.003830 | -0.001571 | -0.002280 | -0.001868 | 0.007266 | 0.007922 | 0.007665 | -0.001398 | -0.029311 | -0.002436 | -0.007124 | 0.000645 | -0.000485 | -0.028808 | -0.005218 | 0.025977 | -0.005347 | 0.026342 | -0.002822 | -0.001857 | 0.060464 | 0.075081 | 1.000000 | -0.027481 | -0.012143 | 0.062528 | 0.015713 | 0.878837 | -0.024273 | 0.278429 | -0.233456 | 0.174431 | -0.020803 | 0.030923 | 0.031620 | 0.010330 |
| DAYS_LAST_PHONE_CHANGE | 0.000776 | -0.002659 | -0.002478 | -0.000391 | 0.001123 | 0.001182 | 0.000882 | -0.002901 | -0.003382 | -0.003171 | -0.006971 | -0.007270 | -0.007243 | 0.011749 | 0.011615 | 0.011920 | 0.002689 | -0.000237 | 0.000591 | -0.000183 | -0.005732 | -0.006458 | -0.007090 | -0.130211 | -0.004054 | -0.004427 | -0.004454 | -0.011752 | -0.011418 | -0.010413 | -0.008792 | -0.007959 | -0.009078 | -0.012220 | -0.011140 | -0.012022 | -0.011096 | -0.009979 | -0.011248 | -0.006165 | -0.006622 | -0.006550 | 0.001918 | 0.003131 | 0.003533 | -0.008522 | -0.075542 | -0.002318 | -0.041114 | -0.000615 | 0.002352 | -0.113448 | -0.002055 | -0.015177 | 0.002201 | -0.014661 | 0.000701 | -0.195827 | -0.076893 | -0.064906 | -0.027481 | 1.000000 | -0.015647 | -0.074388 | -0.017254 | -0.006180 | -0.046043 | 0.083957 | 0.023129 | 0.056938 | 0.086779 | 0.026558 | 0.025939 | 0.054953 |
| HOUR_APPR_PROCESS_START | 0.000205 | 0.047662 | 0.046151 | 0.040003 | 0.014680 | 0.014174 | 0.012107 | 0.078353 | 0.079959 | 0.072238 | 0.113720 | 0.119442 | 0.118550 | -0.016409 | -0.014470 | -0.014282 | -0.069504 | 0.014274 | 0.014503 | 0.011613 | 0.034527 | 0.041399 | 0.040947 | 0.032487 | 0.044565 | 0.038635 | 0.043368 | 0.105367 | 0.106407 | 0.099335 | 0.083651 | 0.074785 | 0.082497 | 0.021172 | 0.016610 | 0.021492 | 0.084724 | 0.075350 | 0.083559 | 0.113995 | 0.119406 | 0.118014 | -0.011315 | -0.010619 | -0.010170 | 0.080780 | -0.040533 | -0.004517 | 0.036501 | -0.017674 | 0.000075 | -0.030689 | -0.000416 | -0.010677 | -0.009769 | -0.010689 | -0.006368 | 0.156600 | 0.062811 | 0.053074 | -0.012143 | -0.015647 | 1.000000 | 0.053257 | 0.033784 | -0.006909 | 0.171821 | 0.092099 | -0.090384 | -0.011111 | 0.032615 | -0.285609 | -0.265247 | -0.022945 |
| AMT_CREDIT | 0.000214 | 0.049198 | 0.048203 | 0.041446 | 0.013413 | 0.012401 | 0.010076 | 0.058731 | 0.060508 | 0.052481 | 0.074611 | 0.078129 | 0.077513 | 0.033075 | 0.034655 | 0.034931 | -0.096874 | 0.004690 | 0.005175 | 0.001402 | 0.033595 | 0.041226 | 0.039395 | 0.167599 | 0.040894 | 0.035297 | 0.038741 | 0.081052 | 0.082385 | 0.076927 | 0.063280 | 0.055799 | 0.061599 | 0.013505 | 0.009157 | 0.014622 | 0.073658 | 0.065696 | 0.072571 | 0.100857 | 0.103899 | 0.103290 | 0.005819 | 0.007028 | 0.006500 | 0.074399 | 0.043049 | -0.001802 | 0.054457 | -0.003724 | 0.004057 | -0.049236 | 0.015057 | 0.001722 | -0.022172 | 0.001677 | -0.019980 | 0.131146 | 0.986998 | 0.769449 | 0.062528 | -0.074388 | 0.053257 | 1.000000 | 0.143687 | 0.001776 | 0.101220 | -0.055576 | -0.066224 | 0.010353 | -0.006176 | -0.102672 | -0.111988 | -0.030187 |
| AMT_INCOME_TOTAL | -0.001795 | 0.086203 | 0.084201 | 0.072656 | 0.030406 | 0.028913 | 0.025624 | 0.105237 | 0.107432 | 0.092782 | 0.130492 | 0.139013 | 0.137605 | 0.038279 | 0.042482 | 0.042782 | -0.119654 | -0.002390 | -0.002143 | -0.004020 | 0.011618 | 0.015454 | 0.014711 | 0.023251 | 0.077089 | 0.064064 | 0.073302 | 0.039690 | 0.040491 | 0.036894 | 0.031310 | 0.027137 | 0.030678 | 0.004576 | 0.002139 | 0.005134 | 0.035924 | 0.031260 | 0.035275 | 0.052066 | 0.054379 | 0.053956 | 0.005204 | 0.005564 | 0.005571 | 0.037922 | -0.029240 | 0.001770 | 0.022868 | 0.000290 | 0.002500 | 0.010620 | 0.004531 | -0.012351 | -0.012178 | -0.012438 | -0.012462 | 0.054966 | 0.146114 | 0.175849 | 0.015713 | -0.017254 | 0.033784 | 0.143687 | 1.000000 | 0.012452 | 0.068597 | 0.025544 | -0.058891 | 0.025475 | 0.008070 | -0.078886 | -0.084670 | -0.002481 |
| CNT_CHILDREN | -0.000688 | -0.000503 | -0.000145 | -0.000906 | 0.004179 | 0.004442 | 0.004294 | -0.005822 | -0.006488 | -0.006230 | -0.009376 | -0.010143 | -0.009670 | 0.029196 | 0.029595 | 0.029646 | 0.009539 | -0.004147 | -0.004457 | -0.003953 | -0.009291 | -0.009050 | -0.009238 | -0.138459 | 0.003166 | 0.002664 | 0.003049 | -0.005835 | -0.006032 | -0.005723 | -0.012330 | -0.011344 | -0.012264 | -0.006975 | -0.005575 | -0.006867 | -0.009387 | -0.008615 | -0.009594 | -0.009677 | -0.009643 | -0.009383 | 0.006001 | 0.006926 | 0.006571 | -0.006763 | -0.043570 | -0.003201 | -0.009941 | -0.000417 | 0.000581 | -0.041786 | -0.008286 | 0.015323 | -0.003045 | 0.015670 | -0.001948 | -0.017545 | -0.002337 | 0.020850 | 0.878837 | -0.006180 | -0.006909 | 0.001776 | 0.012452 | 1.000000 | -0.025826 | 0.331623 | -0.240468 | 0.183940 | -0.028503 | 0.025528 | 0.024614 | 0.019552 |
| REGION_POPULATION_RELATIVE | 0.001271 | 0.168101 | 0.163327 | 0.134159 | 0.024268 | 0.021699 | 0.016331 | 0.190426 | 0.195956 | 0.164517 | 0.273877 | 0.292362 | 0.288614 | -0.064028 | -0.058163 | -0.057069 | -0.082891 | -0.053101 | -0.051987 | -0.061096 | 0.066314 | 0.098987 | 0.094199 | 0.098941 | 0.076143 | 0.051838 | 0.067917 | 0.275174 | 0.281380 | 0.252585 | 0.206390 | 0.175433 | 0.201838 | 0.033167 | 0.015755 | 0.036256 | 0.214648 | 0.182047 | 0.210470 | 0.303690 | 0.322096 | 0.317838 | -0.006707 | -0.006570 | -0.006542 | 0.203455 | -0.006362 | -0.003104 | 0.078099 | -0.003025 | 0.001361 | 0.002898 | -0.000677 | -0.010509 | 0.001552 | -0.010980 | 0.006005 | 0.198794 | 0.105018 | 0.119916 | -0.024273 | -0.046043 | 0.171821 | 0.101220 | 0.068597 | -0.025826 | 1.000000 | -0.029078 | -0.003825 | -0.052062 | -0.003950 | -0.532986 | -0.531728 | -0.037004 |
| DAYS_BIRTH | -0.000841 | 0.006585 | 0.007296 | 0.007584 | 0.000849 | 0.000777 | 0.001163 | 0.013687 | 0.013299 | 0.013336 | 0.000420 | 0.001133 | 0.001302 | 0.025823 | 0.027171 | 0.026899 | 0.007699 | 0.004539 | 0.004210 | 0.004763 | -0.002691 | -0.002384 | -0.002358 | -0.598890 | 0.004914 | 0.004090 | 0.005578 | -0.000223 | -0.000371 | -0.000107 | 0.006776 | 0.006669 | 0.006985 | -0.008534 | -0.008220 | -0.008986 | 0.001366 | 0.001559 | 0.001903 | 0.001685 | 0.002227 | 0.002280 | 0.001740 | 0.002015 | 0.002051 | 0.002688 | -0.206463 | -0.000823 | 0.003435 | 0.003899 | 0.002007 | -0.072728 | -0.011702 | 0.006292 | 0.001259 | 0.006664 | -0.000538 | -0.091607 | -0.053663 | 0.008731 | 0.278429 | 0.083957 | 0.092099 | -0.055576 | 0.025544 | 0.331623 | -0.029078 | 1.000000 | -0.615504 | 0.331472 | 0.272287 | 0.008738 | 0.007549 | 0.078418 |
| DAYS_EMPLOYED | 0.001274 | -0.008967 | -0.009276 | -0.009378 | -0.002721 | -0.002782 | -0.003421 | -0.020043 | -0.020296 | -0.019826 | -0.013644 | -0.014006 | -0.014512 | -0.006851 | -0.007974 | -0.007603 | 0.028075 | -0.011408 | -0.011420 | -0.010425 | -0.000176 | -0.001224 | -0.001120 | 0.289068 | -0.014019 | -0.012948 | -0.014081 | -0.008678 | -0.008651 | -0.008199 | -0.017006 | -0.015507 | -0.016718 | 0.002773 | 0.003498 | 0.002734 | -0.012905 | -0.011724 | -0.013176 | -0.014106 | -0.014993 | -0.015050 | 0.008376 | 0.008846 | 0.008620 | -0.014987 | 0.114225 | 0.002864 | -0.035039 | -0.003969 | 0.001232 | 0.049800 | 0.014332 | 0.006044 | 0.014949 | 0.005798 | 0.017882 | -0.019670 | -0.064092 | -0.103850 | -0.233456 | 0.023129 | -0.090384 | -0.066224 | -0.058891 | -0.240468 | -0.003825 | -0.615504 | 1.000000 | -0.210273 | -0.272791 | 0.032585 | 0.034407 | -0.045064 |
| DAYS_REGISTRATION | -0.000630 | 0.024592 | 0.025303 | 0.025497 | 0.035364 | 0.034240 | 0.032723 | 0.025284 | 0.024839 | 0.023973 | 0.019499 | 0.020757 | 0.020821 | 0.163429 | 0.164861 | 0.165196 | -0.025165 | 0.003442 | 0.003438 | 0.004006 | -0.018812 | -0.020079 | -0.020656 | -0.178719 | 0.052079 | 0.049933 | 0.052898 | 0.000790 | -0.000080 | 0.001957 | 0.013472 | 0.013142 | 0.013577 | -0.062268 | -0.059319 | -0.062525 | 0.007223 | 0.008007 | 0.007687 | 0.049158 | 0.049425 | 0.049661 | 0.010382 | 0.012817 | 0.012831 | 0.019829 | -0.106684 | -0.001097 | -0.010973 | -0.001868 | -0.000931 | -0.025366 | -0.000095 | 0.009425 | 0.004320 | 0.009426 | 0.002464 | -0.058838 | 0.012095 | 0.038813 | 0.174431 | 0.056938 | -0.011111 | 0.010353 | 0.025475 | 0.183940 | -0.052062 | 0.331472 | -0.210273 | 1.000000 | 0.101934 | 0.079297 | 0.072988 | 0.040217 |
| DAYS_ID_PUBLISH | -0.000887 | -0.000485 | -0.000236 | -0.000491 | -0.008094 | -0.007466 | -0.007737 | 0.000204 | 0.000710 | 0.000049 | -0.009859 | -0.009386 | -0.009253 | -0.009393 | -0.009253 | -0.009454 | 0.008747 | -0.005515 | -0.005355 | -0.005961 | -0.011839 | -0.012849 | -0.013046 | -0.132527 | 0.001327 | 0.000026 | 0.001627 | -0.010731 | -0.010767 | -0.010496 | -0.006499 | -0.006220 | -0.006452 | -0.013221 | -0.012944 | -0.013075 | -0.010633 | -0.010999 | -0.010498 | -0.011584 | -0.011297 | -0.011444 | -0.001100 | -0.002182 | -0.001754 | -0.010000 | -0.131930 | -0.002042 | -0.008832 | 0.004427 | -0.002177 | -0.034662 | -0.007338 | -0.012644 | 0.004500 | -0.012238 | 0.002850 | -0.050631 | -0.008840 | 0.011894 | -0.020803 | 0.086779 | 0.032615 | -0.006176 | 0.008070 | -0.028503 | -0.003950 | 0.272287 | -0.272791 | 0.101934 | 1.000000 | -0.005385 | -0.008018 | 0.051695 |
| REGION_RATING_CLIENT | -0.001853 | -0.120701 | -0.117366 | -0.095498 | -0.018347 | -0.015891 | -0.010272 | -0.152176 | -0.156766 | -0.129571 | -0.215123 | -0.229994 | -0.227258 | 0.048298 | 0.043189 | 0.042167 | 0.086297 | 0.046965 | 0.045123 | 0.058796 | -0.032146 | -0.061396 | -0.057318 | -0.113677 | -0.082002 | -0.059503 | -0.075075 | -0.221633 | -0.227037 | -0.201538 | -0.152610 | -0.123397 | -0.148410 | -0.021531 | -0.004438 | -0.023626 | -0.164884 | -0.133665 | -0.161042 | -0.219861 | -0.235021 | -0.231637 | 0.004547 | 0.004508 | 0.004512 | -0.161519 | -0.012732 | 0.003056 | -0.069076 | 0.006634 | -0.001510 | 0.010981 | 0.005321 | 0.034230 | 0.017643 | 0.034598 | 0.015480 | -0.291729 | -0.104647 | -0.129451 | 0.030923 | 0.026558 | -0.285609 | -0.102672 | -0.078886 | 0.025528 | -0.532986 | 0.008738 | 0.032585 | 0.079297 | -0.005385 | 1.000000 | 0.950316 | 0.058141 |
| REGION_RATING_CLIENT_W_CITY | -0.001741 | -0.130876 | -0.127754 | -0.107276 | -0.021329 | -0.019139 | -0.014123 | -0.176999 | -0.181184 | -0.155851 | -0.222929 | -0.236985 | -0.234200 | 0.040781 | 0.036414 | 0.035435 | 0.087654 | 0.037945 | 0.036342 | 0.048524 | -0.046738 | -0.074168 | -0.070167 | -0.113373 | -0.082463 | -0.060918 | -0.075988 | -0.233193 | -0.238425 | -0.213864 | -0.172048 | -0.143996 | -0.167831 | -0.028446 | -0.012132 | -0.030790 | -0.183204 | -0.153244 | -0.179341 | -0.237230 | -0.251429 | -0.248063 | -0.000838 | -0.000733 | -0.000644 | -0.178946 | -0.012105 | 0.002039 | -0.067108 | 0.006760 | -0.001322 | 0.010322 | 0.004850 | 0.029777 | 0.016739 | 0.030115 | 0.014089 | -0.287190 | -0.113207 | -0.143008 | 0.031620 | 0.025939 | -0.265247 | -0.111988 | -0.084670 | 0.024614 | -0.531728 | 0.007549 | 0.034407 | 0.072988 | -0.008018 | 0.950316 | 1.000000 | 0.059963 |
| TARGET | -0.000581 | -0.021858 | -0.021818 | -0.019588 | -0.003702 | -0.002904 | -0.001785 | -0.025916 | -0.026580 | -0.024955 | -0.033119 | -0.033705 | -0.033636 | -0.025586 | -0.025933 | -0.025685 | 0.039531 | -0.013984 | -0.013539 | -0.012519 | -0.021323 | -0.023834 | -0.023122 | -0.155781 | -0.012034 | -0.010751 | -0.011442 | -0.035791 | -0.036381 | -0.034306 | -0.031644 | -0.029427 | -0.031137 | -0.020116 | -0.018407 | -0.020484 | -0.035242 | -0.032972 | -0.034857 | -0.045368 | -0.046041 | -0.045861 | -0.009553 | -0.010557 | -0.010934 | -0.035540 | -0.180865 | -0.001428 | -0.012376 | -0.000547 | 0.000813 | 0.018896 | -0.002230 | 0.009144 | 0.029870 | 0.009272 | 0.031837 | -0.159698 | -0.039304 | -0.012715 | 0.010330 | 0.054953 | -0.022945 | -0.030187 | -0.002481 | 0.019552 | -0.037004 | 0.078418 | -0.045064 | 0.040217 | 0.051695 | 0.058141 | 0.059963 | 1.000000 |
f_aux.get_corr_matrix(dataset = df_loan_train[list_var_continuous],
metodo='pearson', size_figure=[10,8])
0
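A heatmap with this many variables is hard to read by eye, so it can help to list the strongly correlated pairs explicitly. The following is a minimal sketch and not part of `funciones_auxiliares`: the helper `top_correlated_pairs`, its `threshold` parameter and the toy matrix are assumptions made for illustration. The three coefficients are taken from the correlation table above.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(corr: pd.DataFrame, threshold: float = 0.75) -> pd.DataFrame:
    """Hypothetical helper: list pairs with |correlation| >= threshold."""
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (always 1.0) is excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = (
        corr.where(mask)
        .stack()
        .rename_axis(["var_1", "var_2"])
        .reset_index(name="corr")
    )
    # Rows outside the upper triangle (NaN, if present) fail this comparison.
    pairs = pairs[pairs["corr"].abs() >= threshold]
    # Sort by strength, ignoring the sign of the coefficient.
    pairs = pairs.sort_values("corr", key=lambda s: s.abs(), ascending=False)
    return pairs.reset_index(drop=True)


# Toy matrix built from three coefficients reported in the table above.
cols = ["AMT_CREDIT", "AMT_GOODS_PRICE", "AMT_ANNUITY"]
toy_corr = pd.DataFrame(
    [
        [1.0, 0.986998, 0.769449],
        [0.986998, 1.0, 0.774414],
        [0.769449, 0.774414, 1.0],
    ],
    index=cols,
    columns=cols,
)

strong_pairs = top_correlated_pairs(toy_corr, threshold=0.75)
print(strong_pairs)
```

On the real data the same function could be applied to the full matrix, for example `top_correlated_pairs(corr, threshold=0.9)`, assuming `corr` holds the Pearson correlations of the continuous variables.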
Of the correlations observed, I would like to highlight two:
AMT_CREDIT and AMT_ANNUITY have a positive correlation of 0.77: the larger the credit amount granted to the client, the larger the loan annuity.
AMT_CREDIT and AMT_GOODS_PRICE show a positive linear correlation of 0.99: the larger the amount lent to the client, the higher the price of the goods for which the loan is granted. This is to be expected.
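Beyond these two, TARGET is not strongly correlated with any continuous variable. The largest absolute coefficients in the matrix above are only about 0.18 (EXT_SOURCE_3) and 0.16 (EXT_SOURCE_2), so no single variable explains the behavior of our target through a linear relationship.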
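Because a signed sort pushes the strongest negative coefficients to the bottom of the list, it can be useful to rank variables by the absolute value of their correlation with the target as well. This is a minimal sketch under stated assumptions: the helper `rank_by_abs_corr` and the toy data are hypothetical, and the coefficients are copied from the TARGET row of the correlation table above.

```python
import pandas as pd


def rank_by_abs_corr(corr: pd.DataFrame, target: str = "TARGET") -> pd.DataFrame:
    """Hypothetical helper: rank variables by |correlation| with `target`."""
    # Drop the target itself (its self-correlation is always 1.0).
    target_corr = corr[target].drop(labels=target)
    ranked = pd.DataFrame({"corr": target_corr, "abs_corr": target_corr.abs()})
    return ranked.sort_values("abs_corr", ascending=False)


# Toy column with a few coefficients from the TARGET row of the table above.
toy_target_corr = pd.DataFrame(
    {
        "TARGET": {
            "TARGET": 1.0,
            "DAYS_BIRTH": 0.078418,
            "EXT_SOURCE_3": -0.180865,
            "EXT_SOURCE_2": -0.159698,
            "AMT_CREDIT": -0.030187,
        }
    }
)

ranking = rank_by_abs_corr(toy_target_corr)
print(ranking)
```

Even the top entries of such a ranking stay below 0.2 in absolute value, which is consistent with the observation that no single variable has a strong linear relationship with the target.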
corr.loc['TARGET'].sort_values(ascending=False)
TARGET 1.000000 DAYS_BIRTH 0.078418 REGION_RATING_CLIENT_W_CITY 0.059963 REGION_RATING_CLIENT 0.058141 DAYS_LAST_PHONE_CHANGE 0.054953 DAYS_ID_PUBLISH 0.051695 DAYS_REGISTRATION 0.040217 OWN_CAR_AGE 0.039531 DEF_30_CNT_SOCIAL_CIRCLE 0.031837 DEF_60_CNT_SOCIAL_CIRCLE 0.029870 CNT_CHILDREN 0.019552 AMT_REQ_CREDIT_BUREAU_YEAR 0.018896 CNT_FAM_MEMBERS 0.010330 OBS_30_CNT_SOCIAL_CIRCLE 0.009272 OBS_60_CNT_SOCIAL_CIRCLE 0.009144 AMT_REQ_CREDIT_BUREAU_DAY 0.000813 AMT_REQ_CREDIT_BUREAU_HOUR -0.000547 SK_ID_CURR -0.000581 AMT_REQ_CREDIT_BUREAU_WEEK -0.001428 NONLIVINGAPARTMENTS_MODE -0.001785 AMT_REQ_CREDIT_BUREAU_QRT -0.002230 AMT_INCOME_TOTAL -0.002481 NONLIVINGAPARTMENTS_MEDI -0.002904 NONLIVINGAPARTMENTS_AVG -0.003702 YEARS_BEGINEXPLUATATION_MODE -0.009553 YEARS_BEGINEXPLUATATION_AVG -0.010557 NONLIVINGAREA_MODE -0.010751 YEARS_BEGINEXPLUATATION_MEDI -0.010934 NONLIVINGAREA_MEDI -0.011442 NONLIVINGAREA_AVG -0.012034 AMT_REQ_CREDIT_BUREAU_MON -0.012376 LANDAREA_MODE -0.012519 AMT_ANNUITY -0.012715 LANDAREA_AVG -0.013539 LANDAREA_MEDI -0.013984 ENTRANCES_MODE -0.018407 COMMONAREA_MODE -0.019588 ENTRANCES_MEDI -0.020116 ENTRANCES_AVG -0.020484 BASEMENTAREA_MODE -0.021323 COMMONAREA_MEDI -0.021818 COMMONAREA_AVG -0.021858 HOUR_APPR_PROCESS_START -0.022945 BASEMENTAREA_MEDI -0.023122 BASEMENTAREA_AVG -0.023834 LIVINGAPARTMENTS_MODE -0.024955 YEARS_BUILD_MODE -0.025586 YEARS_BUILD_AVG -0.025685 LIVINGAPARTMENTS_MEDI -0.025916 YEARS_BUILD_MEDI -0.025933 LIVINGAPARTMENTS_AVG -0.026580 APARTMENTS_MODE -0.029427 AMT_CREDIT -0.030187 APARTMENTS_MEDI -0.031137 APARTMENTS_AVG -0.031644 LIVINGAREA_MODE -0.032972 FLOORSMIN_MODE -0.033119 FLOORSMIN_MEDI -0.033636 FLOORSMIN_AVG -0.033705 ELEVATORS_MODE -0.034306 LIVINGAREA_MEDI -0.034857 LIVINGAREA_AVG -0.035242 TOTALAREA_MODE -0.035540 ELEVATORS_MEDI -0.035791 ELEVATORS_AVG -0.036381 REGION_POPULATION_RELATIVE -0.037004 AMT_GOODS_PRICE -0.039304 DAYS_EMPLOYED -0.045064 FLOORSMAX_MODE -0.045368 FLOORSMAX_MEDI -0.045861 FLOORSMAX_AVG 
-0.046041 EXT_SOURCE_1 -0.155781 EXT_SOURCE_2 -0.159698 EXT_SOURCE_3 -0.180865 Name: TARGET, dtype: float64
No variable explains the target strongly on its own (the largest absolute correlations, those of EXT_SOURCE_1, EXT_SOURCE_2 and EXT_SOURCE_3, lie between 0.16 and 0.18), which seems normal for a problem as complex as detecting loan repayment difficulties.
Handling null values¶
How null values are handled depends on the context we are working in, the nature of the data and the impact the missing values may have on the analysis or machine learning model. In general, there are several options for imputing null values:
Numeric values: impute the mean if the variables follow a normal distribution, or the median when they contain outliers. Other options are to impute a fixed or default value, or to use a more advanced imputation algorithm (KNN) that predicts the missing values from the values of other columns.
Categorical values: impute the mode when the variables have a dominant value, or assign a fixed value such as 'Desconocido' ("Unknown").
In my case, since I do not have much context about the variables, I will impute the null values of the categorical variables with the fixed value 'Desconocido', because we do not really know the nature of those nulls. I prefer not to impute the mode because some categorical variables have no value that clearly predominates over the others, so we could distort their distribution.
For the numeric variables, I will impute the median: most of them do not follow a normal distribution and, although they do not contain a large percentage of outliers, the median, unlike the mean, is not affected by extreme values. In addition, many machine learning models are sensitive to extreme values, so using the median reduces the chance that the imputed values introduce unwanted noise or bias.
For the boolean variables, which take the value 0 or 1, I will impute the mode instead, since imputing the median makes little sense for a distribution that takes only two values.
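Because the listing above is sorted by signed value, the strongest relationships (which are negative) end up at the bottom. A minimal sketch of ranking by absolute correlation instead, on hypothetical toy data (the column names `target`, `score` and `noise` are illustrative only):

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): a low "score" makes target == 1 more likely
rng = np.random.default_rng(1)
n = 1_000
score = rng.uniform(size=n)
toy = pd.DataFrame({
    "target": (rng.uniform(size=n) > score).astype(int),
    "score": score,
    "noise": rng.normal(size=n),
})

# Correlation of every variable with the target, excluding the target itself
corr_target = toy.corr()["target"].drop("target")

# Order by absolute value but keep the sign in the result
order = corr_target.abs().sort_values(ascending=False).index
ranking = corr_target.reindex(order)
print(ranking)
```

Applied to the notebook's matrix, the equivalent starting point would be `corr.loc['TARGET'].drop('TARGET')`.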
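The strategy described above (median for numeric variables, mode for 0/1 flags, a fixed label for categorical variables) can be sketched as follows. This is an illustration under assumptions rather than the notebook's own code: the helper names, the toy columns and the idea of reusing the training-set statistics on the test set are all hypothetical additions.

```python
import numpy as np
import pandas as pd

def fit_fill_values(train, num_cols, flag_cols):
    """Learn the fill values (median / mode) from the training set only."""
    values = {col: train[col].median() for col in num_cols}
    values.update({col: train[col].mode()[0] for col in flag_cols})
    return values

def impute(df, fill_values, cat_cols, cat_fill="Desconocido"):
    """Apply previously learned fill values; categorical nulls get a fixed label."""
    out = df.fillna(value=fill_values)
    out[cat_cols] = out[cat_cols].fillna(cat_fill)
    return out

# Toy data (hypothetical, not taken from the dataset)
train = pd.DataFrame({
    "income": [100.0, 120.0, np.nan, 110.0, 5000.0],  # numeric, with an outlier
    "has_phone": [1.0, 1.0, 0.0, np.nan, 1.0],        # 0/1 flag
    "occupation": ["Laborers", None, "Drivers", "Laborers", None],
})
test = pd.DataFrame({
    "income": [np.nan, 90.0],
    "has_phone": [np.nan, 0.0],
    "occupation": [None, "Drivers"],
})

fill_values = fit_fill_values(train, num_cols=["income"], flag_cols=["has_phone"])
train_imp = impute(train, fill_values, cat_cols=["occupation"])
test_imp = impute(test, fill_values, cat_cols=["occupation"])
print(fill_values)
```

In the toy `income` column the median of the observed values is 115, while the mean is pulled above 1,300 by the single outlier, which illustrates why the median is the safer fill value here.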
list_cat_vars, other = f_aux.dame_variables_categoricas(dataset=df_loan_train)
# Make sure the categorical columns accept the 'Desconocido' category
for col in list_cat_vars:
    if isinstance(df_loan_train[col].dtype, pd.CategoricalDtype):
        # Add 'Desconocido' as a category only if it does not exist yet
        if 'Desconocido' not in df_loan_train[col].cat.categories:
            df_loan_train[col] = df_loan_train[col].cat.add_categories(['Desconocido'])
# Impute null values with 'Desconocido'
df_loan_train[list_cat_vars] = df_loan_train[list_cat_vars].fillna(value='Desconocido')
df_loan_train[list_cat_vars]
| | FONDKAPREMONT_MODE | WALLSMATERIAL_MODE | HOUSETYPE_MODE | EMERGENCYSTATE_MODE | OCCUPATION_TYPE | NAME_TYPE_SUITE | ORGANIZATION_TYPE | NAME_CONTRACT_TYPE | FLAG_OWN_CAR | CODE_GENDER | NAME_INCOME_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | NAME_EDUCATION_TYPE | FLAG_OWN_REALTY | WEEKDAY_APPR_PROCESS_START |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 238851 | Desconocido | Desconocido | Desconocido | Desconocido | Laborers | Unaccompanied | Business Entity Type 1 | Revolving loans | N | M | Working | Single / not married | House / apartment | Secondary / secondary special | N | MONDAY |
| 181603 | Desconocido | Stone, brick | block of flats | No | Sales staff | Unaccompanied | Self-employed | Cash loans | Y | F | Working | Married | House / apartment | Secondary / secondary special | N | WEDNESDAY |
| 63661 | reg oper account | Stone, brick | block of flats | No | Sales staff | Family | Business Entity Type 3 | Cash loans | N | F | Commercial associate | Married | House / apartment | Secondary / secondary special | Y | FRIDAY |
| 122457 | reg oper account | Stone, brick | block of flats | No | Sales staff | Unaccompanied | Industry: type 6 | Cash loans | N | F | Working | Civil marriage | House / apartment | Secondary / secondary special | Y | MONDAY |
| 70875 | Desconocido | Block | block of flats | No | Drivers | Unaccompanied | Other | Cash loans | Y | M | Commercial associate | Married | House / apartment | Secondary / secondary special | Y | TUESDAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 216116 | reg oper account | Stone, brick | block of flats | No | Laborers | Unaccompanied | Self-employed | Cash loans | Y | M | Working | Married | House / apartment | Secondary / secondary special | N | FRIDAY |
| 168796 | Desconocido | Stone, brick | block of flats | No | Sales staff | Family | Business Entity Type 3 | Cash loans | N | F | Commercial associate | Married | House / apartment | Higher education | Y | THURSDAY |
| 241375 | Desconocido | Desconocido | Desconocido | Desconocido | Core staff | Unaccompanied | Business Entity Type 1 | Cash loans | N | F | Commercial associate | Single / not married | House / apartment | Secondary / secondary special | N | SATURDAY |
| 297753 | reg oper account | Panel | block of flats | No | Waiters/barmen staff | Unaccompanied | Self-employed | Cash loans | N | F | Working | Married | House / apartment | Incomplete higher | N | TUESDAY |
| 108462 | Desconocido | Desconocido | Desconocido | Desconocido | Laborers | Unaccompanied | Business Entity Type 2 | Cash loans | N | F | Working | Married | House / apartment | Secondary / secondary special | Y | THURSDAY |
246008 rows × 16 columns
We find no null values in the columns that hold boolean values. If there were any and we needed to impute the mode, we could use the loop shown (commented out) in the following code.
df_loan_train[df_loan_bool].isnull().sum()
# for col in df_loan_bool:
#     # Compute the mode of the column
#     moda = df_loan_train[col].mode()[0]
#     # Replace the null values with the mode
#     df_loan_train[col] = df_loan_train[col].fillna(moda)
REG_REGION_NOT_LIVE_REGION 0 FLAG_MOBIL 0 FLAG_EMP_PHONE 0 FLAG_WORK_PHONE 0 FLAG_CONT_MOBILE 0 TARGET 0 LIVE_REGION_NOT_WORK_REGION 0 FLAG_EMAIL 0 FLAG_PHONE 0 REG_CITY_NOT_LIVE_CITY 0 REG_CITY_NOT_WORK_CITY 0 LIVE_CITY_NOT_WORK_CITY 0 REG_REGION_NOT_WORK_REGION 0 FLAG_DOCUMENT_4 0 FLAG_DOCUMENT_5 0 FLAG_DOCUMENT_2 0 FLAG_DOCUMENT_3 0 FLAG_DOCUMENT_11 0 FLAG_DOCUMENT_10 0 FLAG_DOCUMENT_9 0 FLAG_DOCUMENT_8 0 FLAG_DOCUMENT_7 0 FLAG_DOCUMENT_6 0 FLAG_DOCUMENT_12 0 FLAG_DOCUMENT_13 0 FLAG_DOCUMENT_19 0 FLAG_DOCUMENT_18 0 FLAG_DOCUMENT_17 0 FLAG_DOCUMENT_16 0 FLAG_DOCUMENT_15 0 FLAG_DOCUMENT_14 0 FLAG_DOCUMENT_20 0 FLAG_DOCUMENT_21 0 dtype: int64
# Impute null values in numeric columns with the median
for col in df_loan_train.select_dtypes(include=['number']).columns:
    # Compute the median of the column
    mediana = df_loan_train[col].median()
    # Replace the null values with the median
    df_loan_train[col] = df_loan_train[col].fillna(mediana)
df_loan_train[df_loan_num].head(10)
| | SK_ID_CURR | COMMONAREA_AVG | COMMONAREA_MEDI | COMMONAREA_MODE | NONLIVINGAPARTMENTS_AVG | NONLIVINGAPARTMENTS_MEDI | NONLIVINGAPARTMENTS_MODE | LIVINGAPARTMENTS_MEDI | LIVINGAPARTMENTS_AVG | LIVINGAPARTMENTS_MODE | FLOORSMIN_AVG | YEARS_BUILD_MODE | YEARS_BUILD_MEDI | YEARS_BUILD_AVG | OWN_CAR_AGE | LANDAREA_MEDI | LANDAREA_AVG | LANDAREA_MODE | BASEMENTAREA_MODE | BASEMENTAREA_AVG | BASEMENTAREA_MEDI | EXT_SOURCE_1 | NONLIVINGAREA_AVG | NONLIVINGAREA_MODE | NONLIVINGAREA_MEDI | ELEVATORS_AVG | APARTMENTS_AVG | APARTMENTS_MODE | APARTMENTS_MEDI | ENTRANCES_AVG | LIVINGAREA_AVG | LIVINGAREA_MODE | LIVINGAREA_MEDI | FLOORSMAX_AVG | FLOORSMAX_MEDI | YEARS_BEGINEXPLUATATION_MODE | YEARS_BEGINEXPLUATATION_AVG | YEARS_BEGINEXPLUATATION_MEDI | TOTALAREA_MODE | EXT_SOURCE_3 | EXT_SOURCE_2 | AMT_GOODS_PRICE | AMT_ANNUITY | DAYS_LAST_PHONE_CHANGE | ORGANIZATION_TYPE | AMT_CREDIT | AMT_INCOME_TOTAL | REGION_POPULATION_RELATIVE | DAYS_BIRTH | DAYS_EMPLOYED | DAYS_REGISTRATION | DAYS_ID_PUBLISH |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 238851 | 376683 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 9.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0036 | 0.0011 | 0.0030 | 0.00 | 0.0876 | 0.0840 | 0.0874 | 0.1379 | 0.0745 | 0.0731 | 0.0749 | 0.1667 | 0.1667 | 0.9816 | 0.9816 | 0.9816 | 0.0688 | 0.204423 | 0.409389 | 180000.0 | 9000.0 | -1237.0 | Business Entity Type 1 | 180000.0 | 135000.0 | 0.008474 | -9935 | -869 | -3440.0 | -2546 |
| 181603 | 310487 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 2.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0036 | 0.0011 | 0.0030 | 0.00 | 0.0814 | 0.0830 | 0.0822 | 0.2069 | 0.0919 | 0.0957 | 0.0935 | 0.1667 | 0.1667 | 0.9926 | 0.9925 | 0.9925 | 0.0723 | 0.722393 | 0.728032 | 679500.0 | 28404.0 | -1692.0 | Self-employed | 787131.0 | 135000.0 | 0.010147 | -10078 | -1289 | -608.0 | -1233 |
| 63661 | 173827 | 0.0116 | 0.0116 | 0.0117 | 0.0000 | 0.0000 | 0.0000 | 0.0599 | 0.0588 | 0.0643 | 0.2083 | 0.7060 | 0.6981 | 0.6940 | 9.0 | 0.0644 | 0.0633 | 0.0647 | 0.0552 | 0.0532 | 0.0532 | 0.233131 | 0.0000 | 0.0000 | 0.0000 | 0.00 | 0.0722 | 0.0735 | 0.0729 | 0.1724 | 0.0525 | 0.0547 | 0.0535 | 0.1667 | 0.1667 | 0.9777 | 0.9776 | 0.9776 | 0.0568 | 0.535276 | 0.392192 | 229500.0 | 27454.5 | -777.0 | Business Entity Type 3 | 253737.0 | 189000.0 | 0.046220 | -9425 | -435 | -4201.0 | -98 |
| 122457 | 241978 | 0.0064 | 0.0065 | 0.0065 | 0.0039 | 0.0039 | 0.0039 | 0.0547 | 0.0538 | 0.0588 | 0.2083 | 0.6864 | 0.6780 | 0.6736 | 9.0 | 0.0333 | 0.0328 | 0.0335 | 0.0845 | 0.0815 | 0.0815 | 0.889098 | 0.0509 | 0.0539 | 0.0520 | 0.00 | 0.0670 | 0.0683 | 0.0677 | 0.1379 | 0.0508 | 0.0529 | 0.0517 | 0.1667 | 0.1667 | 0.9762 | 0.9762 | 0.9762 | 0.0546 | 0.481249 | 0.568924 | 477000.0 | 17775.0 | -2639.0 | Industry: type 6 | 552555.0 | 90000.0 | 0.031329 | -20494 | -2304 | -10741.0 | -4051 |
| 70875 | 182206 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 5.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0000 | 0.0000 | 0.0000 | 0.00 | 0.0753 | 0.0588 | 0.0760 | 0.1724 | 0.0745 | 0.0731 | 0.0749 | 0.1667 | 0.1667 | 0.9836 | 0.9836 | 0.9836 | 0.0408 | 0.275000 | 0.294987 | 477000.0 | 40797.0 | -129.0 | Other | 558855.0 | 225000.0 | 0.031329 | -14160 | -289 | -6104.0 | -4666 |
| 233090 | 369981 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 9.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0036 | 0.0011 | 0.0030 | 0.00 | 0.0876 | 0.0840 | 0.0874 | 0.1379 | 0.0977 | 0.1018 | 0.0995 | 0.1667 | 0.1667 | 0.9901 | 0.9901 | 0.9901 | 0.0768 | 0.616122 | 0.285898 | 229500.0 | 25227.0 | -2148.0 | Security | 253737.0 | 67500.0 | 0.008068 | -19518 | -1189 | -1167.0 | -2764 |
| 148840 | 272567 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 9.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0036 | 0.0011 | 0.0030 | 0.00 | 0.0876 | 0.0840 | 0.0874 | 0.1379 | 0.0745 | 0.0731 | 0.0749 | 0.1667 | 0.1667 | 0.9816 | 0.9816 | 0.9816 | 0.0688 | 0.586740 | 0.654621 | 360000.0 | 15790.5 | -721.0 | XNA | 436032.0 | 90000.0 | 0.003122 | -23128 | 365243 | -7790.0 | -748 |
| 176528 | 304561 | 0.0000 | 0.0000 | 0.0000 | 0.0039 | 0.0039 | 0.0039 | 0.0676 | 0.0664 | 0.0725 | 0.2083 | 0.6929 | 0.6847 | 0.6804 | 9.0 | 0.0709 | 0.0697 | 0.0713 | 0.0699 | 0.0673 | 0.0673 | 0.505819 | 0.0029 | 0.0030 | 0.0029 | 0.00 | 0.0825 | 0.0840 | 0.0833 | 0.1379 | 0.0692 | 0.0721 | 0.0705 | 0.1667 | 0.1667 | 0.9767 | 0.9767 | 0.9767 | 0.0723 | 0.691021 | 0.673752 | 1350000.0 | 47443.5 | 0.0 | XNA | 1506816.0 | 135000.0 | 0.018801 | -20207 | 365243 | -9765.0 | -3675 |
| 201528 | 333611 | 0.0211 | 0.0209 | 0.0191 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.2083 | 0.7648 | 0.7585 | 0.7552 | 9.0 | 0.0488 | 0.0483 | 0.0459 | 0.0747 | 0.0764 | 0.0759 | 0.505819 | 0.0036 | 0.0011 | 0.0030 | 0.00 | 0.0876 | 0.0840 | 0.0874 | 0.1379 | 0.0745 | 0.0731 | 0.0749 | 0.1667 | 0.1667 | 0.9816 | 0.9816 | 0.9816 | 0.0688 | 0.586740 | 0.636360 | 166500.0 | 7686.0 | -929.0 | School | 210456.0 | 135000.0 | 0.018029 | -18133 | -2278 | -8441.0 | -1678 |
| 213939 | 347914 | 0.0332 | 0.0334 | 0.0335 | 0.0000 | 0.0000 | 0.0000 | 0.0761 | 0.0756 | 0.0771 | 0.0417 | 0.8236 | 0.8189 | 0.8164 | 9.0 | 0.0196 | 0.0193 | 0.0198 | 0.0358 | 0.0345 | 0.0345 | 0.505819 | 0.1120 | 0.1186 | 0.1144 | 0.04 | 0.0557 | 0.0567 | 0.0562 | 0.0345 | 0.0369 | 0.0385 | 0.0376 | 0.3333 | 0.3333 | 0.9866 | 0.9866 | 0.9866 | 0.0472 | 0.379100 | 0.003770 | 450000.0 | 31261.5 | -550.0 | Self-employed | 640080.0 | 112500.0 | 0.008230 | -20770 | -1077 | -8671.0 | -3281 |
f_aux.get_percent_null_values_target(df_loan_train, [i for i in list_var_continuous], target='TARGET')
No existen variables con valores nulos
This confirms that all null values have been imputed successfully.
Correlation matrix for categorical variables: Cramér's V matrix¶
Since Pearson's coefficient cannot measure the correlation between categorical variables, we will use the closest equivalent, Cramér's V statistic, to observe the association between our categorical variables.
Although our boolean variables, which take the values 0 or 1, are numeric, their origin and interpretation are categorical: a value of 0 belongs to a different category than a value of 1. We will therefore treat them as categorical and measure their association with Cramér's V.
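For reference, the textbook form of Cramér's V is sqrt(chi2 / (n * (min(r, c) - 1))), where chi2 is the chi-squared statistic of the contingency table, n the number of observations and r, c the number of rows and columns. The sketch below implements that uncorrected formula with `scipy.stats.chi2_contingency`; the function name `cramers_v_plain` is hypothetical, and the implementation of `f_aux.cramers_v` is not shown in this notebook, so it may differ (for example by applying a continuity or bias correction).

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v_plain(x, y):
    """Uncorrected Cramér's V between two categorical series."""
    table = pd.crosstab(x, y)
    # correction=False disables the Yates continuity correction on 2x2 tables
    chi2 = chi2_contingency(table, correction=False)[0]
    n = table.to_numpy().sum()
    k = min(table.shape) - 1
    if k == 0:
        # A variable with a single category carries no association information
        return float("nan")
    return float(np.sqrt(chi2 / (n * k)))

# Toy data (hypothetical values)
a = pd.Series([0, 0, 1, 1, 0, 1], name="a")
b = pd.Series([0, 1, 0, 1], name="b")
c = pd.Series([0, 0, 1, 1], name="c")

v_identical = cramers_v_plain(a, a.rename("a_copy"))  # perfect association
v_independent = cramers_v_plain(b, c)                 # no association
print(v_identical, v_independent)
```

The statistic ranges from 0 (no association) to 1 (perfect association) and, unlike Pearson's coefficient, carries no sign.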
df_cat_bool = pd.concat([df_loan_train[df_loan_cat], df_loan_train[df_loan_bool]], axis=1)
df_cat_bool.columns.values
array(['FONDKAPREMONT_MODE', 'FLOORSMIN_MODE', 'FLOORSMIN_MEDI',
'ELEVATORS_MEDI', 'ELEVATORS_MODE', 'WALLSMATERIAL_MODE',
'ENTRANCES_MEDI', 'ENTRANCES_MODE', 'HOUSETYPE_MODE',
'FLOORSMAX_MODE', 'EMERGENCYSTATE_MODE', 'OCCUPATION_TYPE',
'AMT_REQ_CREDIT_BUREAU_WEEK', 'AMT_REQ_CREDIT_BUREAU_MON',
'AMT_REQ_CREDIT_BUREAU_HOUR', 'AMT_REQ_CREDIT_BUREAU_DAY',
'AMT_REQ_CREDIT_BUREAU_YEAR', 'AMT_REQ_CREDIT_BUREAU_QRT',
'NAME_TYPE_SUITE', 'OBS_60_CNT_SOCIAL_CIRCLE',
'DEF_60_CNT_SOCIAL_CIRCLE', 'OBS_30_CNT_SOCIAL_CIRCLE',
'DEF_30_CNT_SOCIAL_CIRCLE', 'CNT_FAM_MEMBERS',
'HOUR_APPR_PROCESS_START', 'NAME_CONTRACT_TYPE', 'FLAG_OWN_CAR',
'CODE_GENDER', 'CNT_CHILDREN', 'NAME_INCOME_TYPE',
'NAME_FAMILY_STATUS', 'NAME_HOUSING_TYPE', 'NAME_EDUCATION_TYPE',
'FLAG_OWN_REALTY', 'REGION_RATING_CLIENT',
'REGION_RATING_CLIENT_W_CITY', 'WEEKDAY_APPR_PROCESS_START',
'REG_REGION_NOT_LIVE_REGION', 'FLAG_MOBIL', 'FLAG_EMP_PHONE',
'FLAG_WORK_PHONE', 'FLAG_CONT_MOBILE', 'TARGET',
'LIVE_REGION_NOT_WORK_REGION', 'FLAG_EMAIL', 'FLAG_PHONE',
'REG_CITY_NOT_LIVE_CITY', 'REG_CITY_NOT_WORK_CITY',
'LIVE_CITY_NOT_WORK_CITY', 'REG_REGION_NOT_WORK_REGION',
'FLAG_DOCUMENT_4', 'FLAG_DOCUMENT_5', 'FLAG_DOCUMENT_2',
'FLAG_DOCUMENT_3', 'FLAG_DOCUMENT_11', 'FLAG_DOCUMENT_10',
'FLAG_DOCUMENT_9', 'FLAG_DOCUMENT_8', 'FLAG_DOCUMENT_7',
'FLAG_DOCUMENT_6', 'FLAG_DOCUMENT_12', 'FLAG_DOCUMENT_13',
'FLAG_DOCUMENT_19', 'FLAG_DOCUMENT_18', 'FLAG_DOCUMENT_17',
'FLAG_DOCUMENT_16', 'FLAG_DOCUMENT_15', 'FLAG_DOCUMENT_14',
'FLAG_DOCUMENT_20', 'FLAG_DOCUMENT_21'], dtype=object)
confusion_matrix = pd.crosstab(df_loan_train["TARGET"], df_loan_train["NAME_CONTRACT_TYPE"])
print(confusion_matrix)
f_aux.cramers_v(confusion_matrix.values)
NAME_CONTRACT_TYPE  Cash loans  Revolving loans
TARGET
0                       204044            22104
1                        18586             1274
np.float64(0.031114763938304826)
confusion_matrix = pd.crosstab(df_loan_train["TARGET"], df_loan_train["TARGET"])
f_aux.cramers_v(confusion_matrix.values)
np.float64(0.9999726127135284)
corr_cats = f_aux.corr_cat(df=df_cat_bool, target='TARGET' ,target_transform=True)
corr_cats
| | FONDKAPREMONT_MODE | WALLSMATERIAL_MODE | HOUSETYPE_MODE | EMERGENCYSTATE_MODE | OCCUPATION_TYPE | NAME_TYPE_SUITE | NAME_CONTRACT_TYPE | FLAG_OWN_CAR | CODE_GENDER | NAME_INCOME_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | NAME_EDUCATION_TYPE | FLAG_OWN_REALTY | WEEKDAY_APPR_PROCESS_START | TARGET |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FONDKAPREMONT_MODE | 1.000000 | 0.350329 | 0.395533 | 0.461823 | 0.031637 | 0.016696 | 0.023575 | 0.015057 | 0.012622 | 0.028740 | 0.024369 | 0.031720 | 0.044419 | 0.017970 | 0.005678 | 0.031948 |
| WALLSMATERIAL_MODE | 0.350329 | 1.000000 | 0.559573 | 0.690289 | 0.032516 | 0.014038 | 0.029114 | 0.034645 | 0.020289 | 0.031086 | 0.033476 | 0.043840 | 0.063345 | 0.029822 | 0.003958 | 0.044376 |
| HOUSETYPE_MODE | 0.395533 | 0.559573 | 1.000000 | 0.669327 | 0.046282 | 0.019799 | 0.028316 | 0.033259 | 0.019805 | 0.043217 | 0.043550 | 0.045783 | 0.068319 | 0.023046 | 0.002006 | 0.040940 |
| EMERGENCYSTATE_MODE | 0.461823 | 0.690289 | 0.669327 | 1.000000 | 0.057169 | 0.025698 | 0.028490 | 0.035851 | 0.021454 | 0.054398 | 0.054786 | 0.060894 | 0.086814 | 0.022301 | 0.005163 | 0.042496 |
| OCCUPATION_TYPE | 0.031637 | 0.032516 | 0.046282 | 0.057169 | 1.000000 | 0.020719 | 0.061912 | 0.256621 | 0.358815 | 0.289628 | 0.090566 | 0.044058 | 0.188272 | 0.049178 | 0.018078 | 0.081136 |
| NAME_TYPE_SUITE | 0.016696 | 0.014038 | 0.019799 | 0.025698 | 0.020719 | 1.000000 | 0.029978 | 0.042797 | 0.044520 | 0.020343 | 0.061945 | 0.019291 | 0.025099 | 0.073248 | 0.016833 | 0.009675 |
| NAME_CONTRACT_TYPE | 0.023575 | 0.029114 | 0.028316 | 0.028490 | 0.061912 | 0.029978 | 0.999976 | 0.005527 | 0.014329 | 0.061772 | 0.047759 | 0.027405 | 0.067877 | 0.068083 | 0.015118 | 0.031115 |
| FLAG_OWN_CAR | 0.015057 | 0.034645 | 0.033259 | 0.035851 | 0.256621 | 0.042797 | 0.005527 | 0.999991 | 0.345930 | 0.156379 | 0.167302 | 0.039645 | 0.097645 | 0.000509 | 0.003590 | 0.021341 |
| CODE_GENDER | 0.012622 | 0.020289 | 0.019805 | 0.021454 | 0.358815 | 0.044520 | 0.014329 | 0.345930 | 1.000000 | 0.119663 | 0.118149 | 0.047076 | 0.018635 | 0.043935 | 0.004841 | 0.055814 |
| NAME_INCOME_TYPE | 0.028740 | 0.031086 | 0.043217 | 0.054398 | 0.289628 | 0.020343 | 0.061772 | 0.156379 | 0.119663 | 1.000000 | 0.112233 | 0.054499 | 0.103975 | 0.072251 | 0.012171 | 0.063505 |
| NAME_FAMILY_STATUS | 0.024369 | 0.033476 | 0.043550 | 0.054786 | 0.090566 | 0.061945 | 0.047759 | 0.167302 | 0.118149 | 0.112233 | 1.000000 | 0.067538 | 0.052131 | 0.051105 | 0.003140 | 0.039752 |
| NAME_HOUSING_TYPE | 0.031720 | 0.043840 | 0.045783 | 0.060894 | 0.044058 | 0.019291 | 0.027405 | 0.039645 | 0.047076 | 0.054499 | 0.067538 | 1.000000 | 0.042252 | 0.226447 | 0.003215 | 0.037953 |
| NAME_EDUCATION_TYPE | 0.044419 | 0.063345 | 0.068319 | 0.086814 | 0.188272 | 0.025099 | 0.067877 | 0.097645 | 0.018635 | 0.103975 | 0.052131 | 0.042252 | 1.000000 | 0.030644 | 0.005185 | 0.057593 |
| FLAG_OWN_REALTY | 0.017970 | 0.029822 | 0.023046 | 0.022301 | 0.049178 | 0.073248 | 0.068083 | 0.000509 | 0.043935 | 0.072251 | 0.051105 | 0.226447 | 0.030644 | 0.999990 | 0.024692 | 0.006420 |
| WEEKDAY_APPR_PROCESS_START | 0.005678 | 0.003958 | 0.002006 | 0.005163 | 0.018078 | 0.016833 | 0.015118 | 0.003590 | 0.004841 | 0.012171 | 0.003140 | 0.003215 | 0.005185 | 0.024692 | 1.000000 | 0.004558 |
| TARGET | 0.031948 | 0.044376 | 0.040940 | 0.042496 | 0.081136 | 0.009675 | 0.031115 | 0.021341 | 0.055814 | 0.063505 | 0.039752 | 0.037953 | 0.057593 | 0.006420 | 0.004558 | 0.999973 |
plt.figure(figsize=(15,8))
sns.heatmap(corr_cats, annot=True, fmt='.3f', cmap='YlGnBu')
plt.title('Cramers V Matrix', fontdict={'size':'17'})
plt.show()
warnings.filterwarnings("ignore")
corr_bool = f_aux.corr_cat_boolean(df_loan_train[df_loan_bool])
corr_bool
| | REG_REGION_NOT_LIVE_REGION | FLAG_MOBIL | FLAG_EMP_PHONE | FLAG_WORK_PHONE | FLAG_CONT_MOBILE | TARGET | LIVE_REGION_NOT_WORK_REGION | FLAG_EMAIL | FLAG_PHONE | REG_CITY_NOT_LIVE_CITY | REG_CITY_NOT_WORK_CITY | LIVE_CITY_NOT_WORK_CITY | REG_REGION_NOT_WORK_REGION | FLAG_DOCUMENT_4 | FLAG_DOCUMENT_5 | FLAG_DOCUMENT_2 | FLAG_DOCUMENT_3 | FLAG_DOCUMENT_11 | FLAG_DOCUMENT_10 | FLAG_DOCUMENT_9 | FLAG_DOCUMENT_8 | FLAG_DOCUMENT_7 | FLAG_DOCUMENT_6 | FLAG_DOCUMENT_12 | FLAG_DOCUMENT_13 | FLAG_DOCUMENT_19 | FLAG_DOCUMENT_18 | FLAG_DOCUMENT_17 | FLAG_DOCUMENT_16 | FLAG_DOCUMENT_15 | FLAG_DOCUMENT_14 | FLAG_DOCUMENT_20 | FLAG_DOCUMENT_21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| REG_REGION_NOT_LIVE_REGION | 0.999866 | 0.000000 | 0.037046 | 0.064987 | 0.000000 | 0.004242 | 0.090931 | 0.018803 | 0.002026 | 0.339547 | 0.142506 | 0.010829 | 0.452122 | 0.000000 | 0.011142 | 0.000000 | 0.033288 | 0.105901 | 0.001342 | 0.017142 | 0.023536 | 0.000000 | 0.023963 | 0.000000 | 0.002580 | 0.000000 | 0.008837 | 0.000000 | 0.005940 | 0.000000 | 0.003106 | 0.000610 | 0.001759 |
| FLAG_MOBIL | 0.000000 | 0.499995 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.002335 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.010736 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_EMP_PHONE | 0.037046 | 0.000000 | 0.999986 | 0.233843 | 0.011449 | 0.046043 | 0.096618 | 0.062399 | 0.014936 | 0.092256 | 0.255917 | 0.218957 | 0.108618 | 0.000000 | 0.018527 | 0.001192 | 0.248955 | 0.029219 | 0.000000 | 0.023234 | 0.122020 | 0.000000 | 0.597988 | 0.000000 | 0.026118 | 0.009385 | 0.040904 | 0.005971 | 0.042980 | 0.014717 | 0.023547 | 0.009698 | 0.007439 |
| FLAG_WORK_PHONE | 0.064987 | 0.000000 | 0.233843 | 0.999987 | 0.021613 | 0.027957 | 0.042017 | 0.012190 | 0.293571 | 0.045825 | 0.121108 | 0.110478 | 0.068964 | 0.003075 | 0.036079 | 0.000000 | 0.061083 | 0.123001 | 0.000598 | 0.007848 | 0.020747 | 0.000000 | 0.138504 | 0.000000 | 0.000000 | 0.011717 | 0.030950 | 0.000000 | 0.004875 | 0.008916 | 0.001766 | 0.000000 | 0.000000 |
| FLAG_CONT_MOBILE | 0.000000 | 0.000000 | 0.011449 | 0.021613 | 0.998932 | 0.000000 | 0.001495 | 0.006958 | 0.007272 | 0.000000 | 0.003522 | 0.003211 | 0.000000 | 0.000000 | 0.004577 | 0.000000 | 0.006392 | 0.000000 | 0.000000 | 0.008224 | 0.021809 | 0.001773 | 0.009241 | 0.000000 | 0.060600 | 0.004141 | 0.040618 | 0.014093 | 0.030104 | 0.012868 | 0.075213 | 0.000000 | 0.007145 |
| TARGET | 0.004242 | 0.000000 | 0.046043 | 0.027957 | 0.000000 | 0.999973 | 0.001437 | 0.002944 | 0.022757 | 0.046275 | 0.051897 | 0.032252 | 0.006048 | 0.000000 | 0.000000 | 0.005469 | 0.044135 | 0.003865 | 0.000000 | 0.003826 | 0.008008 | 0.000000 | 0.028538 | 0.000000 | 0.011950 | 0.000000 | 0.007022 | 0.002479 | 0.010595 | 0.005639 | 0.008764 | 0.000000 | 0.001196 |
| LIVE_REGION_NOT_WORK_REGION | 0.090931 | 0.000000 | 0.096618 | 0.042017 | 0.001495 | 0.001437 | 0.999948 | 0.024498 | 0.005080 | 0.023564 | 0.186336 | 0.237686 | 0.858512 | 0.000000 | 0.014603 | 0.000000 | 0.011904 | 0.005963 | 0.000000 | 0.015080 | 0.061096 | 0.000000 | 0.059169 | 0.000000 | 0.015899 | 0.000000 | 0.003540 | 0.000000 | 0.003389 | 0.004232 | 0.013542 | 0.000000 | 0.000000 |
| FLAG_EMAIL | 0.018803 | 0.000000 | 0.062399 | 0.012190 | 0.006958 | 0.002944 | 0.024498 | 0.999962 | 0.014745 | 0.014999 | 0.003383 | 0.003754 | 0.029056 | 0.000000 | 0.000000 | 0.001125 | 0.011470 | 0.004342 | 0.000000 | 0.009699 | 0.030388 | 0.000000 | 0.041923 | 0.000000 | 0.002992 | 0.000000 | 0.008508 | 0.000000 | 0.012151 | 0.002711 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_PHONE | 0.002026 | 0.000000 | 0.014936 | 0.293571 | 0.007272 | 0.022757 | 0.005080 | 0.014745 | 0.999990 | 0.048395 | 0.045366 | 0.022972 | 0.003957 | 0.003022 | 0.074963 | 0.000000 | 0.007085 | 0.002399 | 0.003787 | 0.012624 | 0.004518 | 0.013119 | 0.008106 | 0.000000 | 0.006715 | 0.009186 | 0.003766 | 0.002557 | 0.009805 | 0.009215 | 0.009131 | 0.000000 | 0.000000 |
| REG_CITY_NOT_LIVE_CITY | 0.339547 | 0.000000 | 0.092256 | 0.045825 | 0.000000 | 0.046275 | 0.023564 | 0.014999 | 0.048395 | 0.999972 | 0.439962 | 0.029861 | 0.153122 | 0.000000 | 0.000000 | 0.000000 | 0.003458 | 0.056019 | 0.001801 | 0.005296 | 0.018503 | 0.000000 | 0.058091 | 0.000000 | 0.000000 | 0.004594 | 0.012980 | 0.000672 | 0.011521 | 0.000000 | 0.003654 | 0.000000 | 0.000000 |
| REG_CITY_NOT_WORK_CITY | 0.142506 | 0.000000 | 0.255917 | 0.121108 | 0.003522 | 0.051897 | 0.186336 | 0.003383 | 0.045366 | 0.439962 | 0.999989 | 0.825180 | 0.240299 | 0.000000 | 0.012870 | 0.000000 | 0.056723 | 0.033837 | 0.000000 | 0.000000 | 0.042682 | 0.000000 | 0.157524 | 0.000000 | 0.000000 | 0.002283 | 0.013347 | 0.000000 | 0.001868 | 0.000000 | 0.005424 | 0.000000 | 0.000951 |
| LIVE_CITY_NOT_WORK_CITY | 0.010829 | 0.000000 | 0.218957 | 0.110478 | 0.003211 | 0.032252 | 0.237686 | 0.003754 | 0.022972 | 0.029861 | 0.825180 | 0.999986 | 0.197453 | 0.000720 | 0.015206 | 0.000000 | 0.053971 | 0.000806 | 0.000000 | 0.005202 | 0.042419 | 0.000000 | 0.133376 | 0.000000 | 0.000000 | 0.000000 | 0.005987 | 0.000000 | 0.003546 | 0.000000 | 0.004516 | 0.002527 | 0.002813 |
| REG_REGION_NOT_WORK_REGION | 0.452122 | 0.000000 | 0.108618 | 0.068964 | 0.000000 | 0.006048 | 0.858512 | 0.029056 | 0.003957 | 0.153122 | 0.240299 | 0.197453 | 0.999958 | 0.000000 | 0.017619 | 0.000000 | 0.021723 | 0.058898 | 0.000000 | 0.019851 | 0.060633 | 0.000000 | 0.066960 | 0.000000 | 0.011145 | 0.000000 | 0.007072 | 0.000000 | 0.000000 | 0.003020 | 0.013681 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_4 | 0.000000 | 0.000000 | 0.000000 | 0.003075 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.003022 | 0.000000 | 0.000000 | 0.000720 | 0.000000 | 0.972220 | 0.000000 | 0.000000 | 0.012711 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_5 | 0.011142 | 0.000000 | 0.018527 | 0.036079 | 0.004577 | 0.000000 | 0.014603 | 0.000000 | 0.074963 | 0.000000 | 0.012870 | 0.015206 | 0.017619 | 0.000000 | 0.999862 | 0.000000 | 0.192882 | 0.007178 | 0.000000 | 0.007170 | 0.036578 | 0.000000 | 0.038148 | 0.000000 | 0.006805 | 0.001212 | 0.010531 | 0.000000 | 0.011577 | 0.003298 | 0.006119 | 0.000366 | 0.000000 |
| FLAG_DOCUMENT_2 | 0.000000 | 0.000000 | 0.001192 | 0.000000 | 0.000000 | 0.005469 | 0.000000 | 0.001125 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.954543 | 0.009591 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_3 | 0.033288 | 0.000000 | 0.248955 | 0.061083 | 0.006392 | 0.044135 | 0.011904 | 0.011470 | 0.007085 | 0.003458 | 0.056723 | 0.053971 | 0.021723 | 0.012711 | 0.192882 | 0.009591 | 0.999990 | 0.093042 | 0.007237 | 0.097963 | 0.466289 | 0.021690 | 0.486185 | 0.000000 | 0.020905 | 0.009128 | 0.008287 | 0.000000 | 0.032886 | 0.000000 | 0.000000 | 0.007662 | 0.022773 |
| FLAG_DOCUMENT_11 | 0.105901 | 0.000000 | 0.029219 | 0.123001 | 0.000000 | 0.003865 | 0.005963 | 0.004342 | 0.002399 | 0.056019 | 0.033837 | 0.000806 | 0.058898 | 0.000000 | 0.007178 | 0.000000 | 0.093042 | 0.999479 | 0.000000 | 0.002743 | 0.017243 | 0.000000 | 0.018783 | 0.000000 | 0.002496 | 0.000000 | 0.004986 | 0.000000 | 0.005553 | 0.000000 | 0.001997 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_10 | 0.001342 | 0.000000 | 0.000000 | 0.000598 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.003787 | 0.001801 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.007237 | 0.000000 | 0.928569 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_9 | 0.017142 | 0.000000 | 0.023234 | 0.007848 | 0.008224 | 0.003826 | 0.015080 | 0.009699 | 0.012624 | 0.005296 | 0.000000 | 0.005202 | 0.019851 | 0.000000 | 0.007170 | 0.000000 | 0.097963 | 0.002743 | 0.000000 | 0.999478 | 0.018421 | 0.000000 | 0.019225 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.004882 | 0.007729 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_8 | 0.023536 | 0.002335 | 0.122020 | 0.020747 | 0.021809 | 0.008008 | 0.061096 | 0.030388 | 0.004518 | 0.018503 | 0.042682 | 0.042419 | 0.060633 | 0.000000 | 0.036578 | 0.000000 | 0.466289 | 0.017243 | 0.000000 | 0.018421 | 0.999973 | 0.003075 | 0.092426 | 0.000000 | 0.078396 | 0.000000 | 0.006243 | 0.004945 | 0.012813 | 0.022567 | 0.031373 | 0.000671 | 0.002117 |
| FLAG_DOCUMENT_7 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.001773 | 0.000000 | 0.000000 | 0.000000 | 0.013119 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.021690 | 0.000000 | 0.000000 | 0.000000 | 0.003075 | 0.989794 | 0.003308 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_6 | 0.023963 | 0.000000 | 0.597988 | 0.138504 | 0.009241 | 0.028538 | 0.059169 | 0.041923 | 0.008106 | 0.058091 | 0.157524 | 0.133376 | 0.066960 | 0.000000 | 0.038148 | 0.000000 | 0.486185 | 0.018783 | 0.000000 | 0.019225 | 0.092426 | 0.003308 | 0.999975 | 0.000000 | 0.017395 | 0.004596 | 0.024457 | 0.002826 | 0.026379 | 0.009714 | 0.014107 | 0.005056 | 0.004459 |
| FLAG_DOCUMENT_12 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.499995 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| FLAG_DOCUMENT_13 | 0.002580 | 0.000000 | 0.026118 | 0.000000 | 0.060600 | 0.011950 | 0.015899 | 0.002992 | 0.006715 | 0.000000 | 0.000000 | 0.000000 | 0.011145 | 0.000000 | 0.006805 | 0.000000 | 0.020905 | 0.002496 | 0.000000 | 0.000000 | 0.078396 | 0.000000 | 0.017395 | 0.000000 | 0.999429 | 0.000000 | 0.004689 | 0.000000 | 0.005238 | 0.000000 | 0.001730 | 0.033210 | 0.004520 |
| FLAG_DOCUMENT_19 | 0.000000 | 0.000000 | 0.009385 | 0.011717 | 0.004141 | 0.000000 | 0.000000 | 0.000000 | 0.009186 | 0.004594 | 0.002283 | 0.000000 | 0.000000 | 0.000000 | 0.001212 | 0.000000 | 0.009128 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.004596 | 0.000000 | 0.000000 | 0.996642 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.039555 | 0.000000 |
| FLAG_DOCUMENT_18 | 0.008837 | 0.010736 | 0.040904 | 0.030950 | 0.040618 | 0.007022 | 0.003540 | 0.008508 | 0.003766 | 0.012980 | 0.013347 | 0.005987 | 0.007072 | 0.000000 | 0.010531 | 0.000000 | 0.008287 | 0.004986 | 0.000000 | 0.000000 | 0.006243 | 0.000000 | 0.024457 | 0.000000 | 0.004689 | 0.000000 | 0.999753 | 0.000000 | 0.008647 | 0.001622 | 0.004138 | 0.086002 | 0.001231 |
| FLAG_DOCUMENT_17 | 0.000000 | 0.000000 | 0.005971 | 0.000000 | 0.014093 | 0.002479 | 0.000000 | 0.000000 | 0.002557 | 0.000672 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.004882 | 0.004945 | 0.000000 | 0.002826 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.991665 | 0.000000 | 0.000000 | 0.000000 | 0.028338 | 0.000000 |
| FLAG_DOCUMENT_16 | 0.005940 | 0.000000 | 0.042980 | 0.004875 | 0.030104 | 0.010595 | 0.003389 | 0.012151 | 0.009805 | 0.011521 | 0.001868 | 0.003546 | 0.000000 | 0.000000 | 0.011577 | 0.000000 | 0.032886 | 0.005553 | 0.000000 | 0.007729 | 0.012813 | 0.000000 | 0.026379 | 0.000000 | 0.005238 | 0.000000 | 0.008647 | 0.000000 | 0.999791 | 0.002112 | 0.004655 | 0.080686 | 0.000000 |
| FLAG_DOCUMENT_15 | 0.000000 | 0.000000 | 0.014717 | 0.008916 | 0.012868 | 0.005639 | 0.004232 | 0.002711 | 0.009215 | 0.000000 | 0.000000 | 0.000000 | 0.003020 | 0.000000 | 0.003298 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.022567 | 0.000000 | 0.009714 | 0.000000 | 0.000000 | 0.000000 | 0.001622 | 0.000000 | 0.002112 | 0.998359 | 0.000000 | 0.027209 | 0.000000 |
| FLAG_DOCUMENT_14 | 0.003106 | 0.000000 | 0.023547 | 0.001766 | 0.075213 | 0.008764 | 0.013542 | 0.000000 | 0.009131 | 0.003654 | 0.005424 | 0.004516 | 0.013681 | 0.000000 | 0.006119 | 0.000000 | 0.000000 | 0.001997 | 0.000000 | 0.000000 | 0.031373 | 0.000000 | 0.014107 | 0.000000 | 0.001730 | 0.000000 | 0.004138 | 0.000000 | 0.004655 | 0.000000 | 0.999319 | 0.023345 | 0.000000 |
| FLAG_DOCUMENT_20 | 0.000610 | 0.000000 | 0.009698 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.002527 | 0.000000 | 0.000000 | 0.000366 | 0.000000 | 0.007662 | 0.000000 | 0.000000 | 0.000000 | 0.000671 | 0.000000 | 0.005056 | 0.000000 | 0.033210 | 0.039555 | 0.086002 | 0.028338 | 0.080686 | 0.027209 | 0.023345 | 0.996030 | 0.004427 |
| FLAG_DOCUMENT_21 | 0.001759 | 0.000000 | 0.007439 | 0.000000 | 0.007145 | 0.001196 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000951 | 0.002813 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.022773 | 0.000000 | 0.000000 | 0.000000 | 0.002117 | 0.000000 | 0.004459 | 0.000000 | 0.004520 | 0.000000 | 0.001231 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.004427 | 0.993053 |
plt.figure(figsize=(30,15))
sns.heatmap(corr_bool, annot=True, fmt='.3f', cmap='YlGnBu')
plt.title('Cramers V Matrix', fontdict={'size':'17'})
plt.show()
Although no strong associations appear between the categorical and boolean variables and our target variable, the variable with the highest association is OCCUPATION_TYPE, which we already discussed in the graphical analysis. Its Cramér's V with TARGET is about 0.08 (8%); this is small, but the variable could still matter in the model.
Some pairs show associations between 30% and 70%, such as the housing type and its construction materials, along with other housing characteristics. These values are not a concern, since the relationship is a logical one.
There is also an association of 42.3% between the client's occupation and the type of organization they work for. A priori this is also an expected relationship and not a concern.
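For reference, the kind of value shown in the matrix above can be reproduced with a minimal sketch of Cramér's V built on `scipy.stats.chi2_contingency`. The function name `cramers_v_sketch` and the toy series are illustrative assumptions. The notebook's own helper in `funciones_auxiliares` is not shown here and may differ, for example by applying a bias correction, which would be consistent with diagonal values slightly below 1.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def cramers_v_sketch(x: pd.Series, y: pd.Series) -> float:
    """Uncorrected Cramér's V between two categorical series (illustrative)."""
    table = pd.crosstab(x, y)
    n = table.to_numpy().sum()
    min_dim = min(table.shape) - 1
    # A variable with a single level carries no association information
    if n == 0 or min_dim == 0:
        return 0.0
    # correction=False disables Yates' continuity correction for 2x2 tables
    chi2, _, _, _ = chi2_contingency(table, correction=False)
    return float(np.sqrt(chi2 / (n * min_dim)))


# Toy data (hypothetical): b is fully determined by a, c is only loosely related
a = pd.Series(["x", "x", "y", "y", "x", "y"])
b = pd.Series(["p", "p", "q", "q", "p", "q"])
c = pd.Series(["p", "q", "p", "q", "p", "q"])

v_same = cramers_v_sketch(a, b)
v_mixed = cramers_v_sketch(a, c)
print(v_same, v_mixed)
```

Cramér's V is symmetric and bounded between 0 and 1, which is why the matrix can be read like a correlation heatmap even though the variables are categorical.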
Weight of Evidence (WoE) and Information Value (IV)¶
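WoE is a measure that maps each category of a categorical variable (or each bin of a continuous one) onto a scale reflecting the relationship between the two groups of the dependent variable (for example, "fraud" and "no fraud"). Here, the distribution of a class in a category is the share of all observations of that class that fall in the category. WoE is computed as follows:
$$ \mathrm{WoE} = \ln\left(\frac{\text{Distribution of the positive class}}{\text{Distribution of the negative class}}\right) $$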
Interpretation:
- If WoE > 0, the category holds a larger share of the positives than of the negatives (it points towards the positive class).
- If WoE < 0, the category holds a larger share of the negatives than of the positives (it points towards the negative class).
- WoE = 0 means the category holds the same share of positives as of negatives, i.e. its positive rate equals that of the whole sample, so it adds little information.
The Information Value (IV) is a metric that quantifies the predictive power of a variable with respect to the target variable. It accumulates, over all groups, the differences between the proportions of positives and negatives.
The IV is computed by summing the WoE values, each weighted by the difference between the proportions of positives and negatives in its group (the same distributions used in the WoE formula):
$$ \mathrm{IV} = \sum_{i} \left(\text{Proportion of the positive class}_i - \text{Proportion of the negative class}_i\right) \times \mathrm{WoE}_i $$
Interpretation of the IV:
- IV < 0.02: Low predictive power.
- 0.02 < IV < 0.1: Weak predictive power.
- 0.1 < IV < 0.3: Moderate predictive power.
- 0.3 < IV < 0.5: High predictive power.
- IV > 0.5: Very high predictive power (although care must be taken not to overfit the model).
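The two formulas above can be sketched in a few lines of pandas. This is a minimal illustration, not the implementation of `f_aux.calculate_woe_iv_categorical`, whose source is not shown here. The name `woe_iv_sketch`, the toy data, and the choice to assign a WoE of 0 to categories with no positives or no negatives are all assumptions.

```python
import numpy as np
import pandas as pd


def woe_iv_sketch(df: pd.DataFrame, variable: str, target: str):
    """WoE per category and total IV for a binary 0/1 target (illustrative)."""
    counts = pd.crosstab(df[variable], df[target])
    # Share of all positives / all negatives that fall in each category
    pos_dist = counts[1] / counts[1].sum()
    neg_dist = counts[0] / counts[0].sum()

    woe, iv = {}, 0.0
    for cat in counts.index:
        if pos_dist[cat] == 0 or neg_dist[cat] == 0:
            # Assumption: ln(0) or division by zero is replaced by 0
            woe[cat] = 0.0
            continue
        woe[cat] = float(np.log(pos_dist[cat] / neg_dist[cat]))
        iv += float((pos_dist[cat] - neg_dist[cat]) * woe[cat])
    return woe, iv


# Toy data (hypothetical): A has 1 positive out of 10, B has 3 out of 10
toy = pd.DataFrame({
    "cat": ["A"] * 10 + ["B"] * 10,
    "TARGET": [1] + [0] * 9 + [1] * 3 + [0] * 7,
})
woe, iv = woe_iv_sketch(toy, "cat", "TARGET")
print(woe, iv)
```

With the sign convention used in the text, category B (higher positive rate) gets a positive WoE and category A a negative one. Every term of the IV sum is non-negative, because the difference of proportions and the logarithm always share the same sign.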
Next we compute the WoE and IV for a few categorical variables that I find interesting, and then comment on the conclusions.
woe_dict, iv = f_aux.calculate_woe_iv_categorical(df=df_loan_train, variable='OCCUPATION_TYPE', target='TARGET')
print("WoE por categoría:", woe_dict)
print("IV de la variable:", iv)
WoE por categoría: {'Accountants': np.float64(-0.5400057046334321), 'Cleaning staff': np.float64(0.20125383936768831), 'Cooking staff': np.float64(0.317780773417085), 'Core staff': np.float64(-0.2541654427300936), 'Drivers': np.float64(0.3876294694822735), 'HR staff': np.float64(-0.3778470566427261), 'High skill tech staff': np.float64(-0.29303383912994113), 'IT staff': np.float64(-0.11435561393858834), 'Laborers': np.float64(0.29302322513685075), 'Low-skill Laborers': np.float64(0.8619870550944092), 'Managers': np.float64(-0.3019227627931952), 'Medicine staff': np.float64(-0.2105095609540891), 'Private service staff': np.float64(-0.2755682075223078), 'Realty agents': np.float64(-0.039935644637323194), 'Sales staff': np.float64(0.20619282273186823), 'Secretaries': np.float64(-0.14368624748642267), 'Security staff': np.float64(0.33001694046914937), 'Waiters/barmen staff': np.float64(0.33623333358368496), 'Desconocido': np.float64(-0.24043809051046666)}
IV de la variable: 0.08587967416283065
woe_dict, iv = f_aux.calculate_woe_iv_categorical(df=df_loan_train, variable='NAME_INCOME_TYPE', target='TARGET')
print("WoE por categoría:", woe_dict)
print("IV de la variable:", iv)
WoE por categoría: {'Businessman': 0, 'Commercial associate': np.float64(-0.08014465465518467), 'Maternity leave': np.float64(2.4324819935799025), 'Pensioner': np.float64(-0.43494908113169145), 'State servant': np.float64(-0.35974790110068705), 'Student': 0, 'Unemployed': np.float64(2.027016885471738), 'Working': np.float64(0.1878468975898912), 'Desconocido': 0}
IV de la variable: 0.05808599223176106
woe_dict, iv = f_aux.calculate_woe_iv_categorical(df=df_loan_train, variable='NAME_EDUCATION_TYPE', target='TARGET')
print("WoE por categoría:", woe_dict)
print("IV de la variable:", iv)
WoE por categoría: {'Academic degree': np.float64(-2.4503199290064686), 'Higher education': np.float64(-0.4393091653969691), 'Incomplete higher': np.float64(0.05657720465626268), 'Lower secondary': np.float64(0.3385501370579372), 'Secondary / secondary special': np.float64(0.11163766773089984), 'Desconocido': 0}
IV de la variable: 0.05154040418506241
woe_dict, iv = f_aux.calculate_woe_iv_categorical(df=df_loan_train, variable='CODE_GENDER', target='TARGET')
print("WoE por categoría:", woe_dict)
print("IV de la variable:", iv)
WoE por categoría: {'F': np.float64(-0.1579356962666567), 'M': np.float64(0.2556181541980332), 'XNA': 0, 'Desconocido': 0}
IV de la variable: 0.040237003552605975
Here are my conclusions on the 4 variables analyzed:
In 'OCCUPATION_TYPE', lower-skilled jobs have positive WoE coefficients: the larger the coefficient, the larger the share of TARGET = 1 in that job type. Clients with low-skilled jobs such as 'Low-skill Laborers', 'Drivers', 'Security staff' or 'Waiters/barmen staff' therefore show a higher proportion of TARGET = 1 (payment difficulties). Conversely, clients with more highly skilled jobs have negative coefficients, meaning those categories hold a larger proportion of clients with TARGET = 0.
In 'NAME_INCOME_TYPE', 'Unemployed' and 'Maternity leave' have large positive coefficients, so they point strongly towards TARGET = 1 (payment difficulties). 'Pensioner' and 'State servant', on the other hand, have negative coefficients, meaning those categories hold a larger proportion of clients with TARGET = 0. 'Businessman' has a value of 0. Like 'Student' and 'Desconocido', it is printed as a plain 0 rather than an np.float64, which suggests a default assigned when the WoE cannot be computed (for instance, when a category has no TARGET = 1 or no TARGET = 0 cases) rather than a category whose positive rate matches the overall rate. It should be checked against the category counts before drawing conclusions.
In 'NAME_EDUCATION_TYPE', clients with a higher level of education have negative coefficients and clients with a lower level of education have positive ones. In principle, this is a logical result.
The 'CODE_GENDER' variable is interesting: men ('M') have a positive coefficient and women ('F') a negative one, so a priori male clients show a higher proportion of TARGET = 1 (payment difficulties) than female clients.
As for the IV, all four values lie in the interval 0.02 < IV < 0.1, so each variable has weak predictive power on its own. This is not unusual: strong predictive power usually comes from combining several variables, and a single variable with very high IV would itself call for caution, since it could point to overfitting or bias.
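Several categories in the outputs above ('Businessman', 'Student', 'XNA', 'Desconocido') show a WoE of exactly 0, printed as a plain integer. One possible explanation, stated here as an assumption because the helper's source is not shown, is that these categories have no positives or no negatives, so the logarithm is undefined. A quick way to check is to tabulate the counts per category. The sketch below uses hypothetical toy data and column names.

```python
import pandas as pd

# Toy data (hypothetical): "Student" has no TARGET = 1 cases at all
toy = pd.DataFrame({
    "income_type": ["Working"] * 6 + ["Student"] * 2 + ["Pensioner"] * 4,
    "TARGET": [1, 0, 0, 0, 1, 0] + [0, 0] + [0, 0, 1, 0],
})

# Rows: categories; columns: number of TARGET = 0 and TARGET = 1 cases
counts = pd.crosstab(toy["income_type"], toy["TARGET"])
counts.columns = ["non_events", "events"]

# WoE is undefined whenever either count is zero
counts["undefined_woe"] = (counts["events"] == 0) | (counts["non_events"] == 0)
print(counts)
```

On the real data, the same check would use `df_loan_train['NAME_INCOME_TYPE']` and `df_loan_train['TARGET']`. Categories flagged this way are typically very small, so their WoE says little either way.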
Exporting the datasets¶
print(df_loan_train.shape, df_loan_test.shape)
(246008, 122) (61503, 122)
df_loan_train.to_csv('../../data_loan_status/data_split/df_loan_train.csv', index=False)
df_loan_test.to_csv('../../data_loan_status/data_split/df_loan_test.csv', index=False)
EDA conclusions¶
As an initial hypothesis, and in answer to the question posed for this assignment (is there a type of client more prone to not repaying a loan?), our exploratory data analysis lets us sketch which type of client would be more prone to default. Note that this client profile is my own hypothesis, based on my reading of the statistics visualized in the EDA, and we will be able to test it when we carry out the feature engineering and modeling. In that part of the assignment we will come back to whether the hypothesis posed here is supported.
Based on the exploratory data analysis carried out in the first 2 notebooks, we can expect the type of client who will have difficulties paying or fully repaying the loan to be:
- A client with a low level of education
- Who owns an old car
- With a low-skilled job
- Who lives in a home built with poor materials, especially wood.
- With a large family, with more than 2 children
- Who is unemployed or on leave
Later, during the feature engineering and modeling, we will verify whether this initial hypothesis, based on my interpretation of the statistics computed and visualized, holds.
In carrying out this exploratory data analysis we have covered:
- A thorough understanding of our data and of the business problem.
- Importing our data, checking its dimensions, and splitting and identifying the different variable categories, with a visualization of each.
- Detecting, plotting and analyzing our target variable, concluding that it is clearly imbalanced.
- Splitting our dataset into train and test sets in a stratified way, because of the imbalance in our target variable.
- Descriptive visualization of our variables, to understand their nature, distribution and relevance to the target variable.
- Treatment of outliers, understanding their importance and the impact they could have on the modeling phase.
- Treatment of missing values across all data types (numeric, boolean and categorical), learning about and reflecting on the different imputation methods, and observing how they affect the distribution and descriptive statistics of our variables.
- Correlation analysis of the variables, to understand how highly correlated variables relate to each other and to our target variable.
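The stratified split mentioned in the list above can be sketched with `train_test_split` and its `stratify` argument. The toy DataFrame is hypothetical, and `test_size=0.2` is an assumption inferred from the exported shapes (246008 and 61503 rows). The seed is the constant defined at the top of the notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

seed = 12354  # same constant as in the notebook

# Toy data (hypothetical): 8% positives, similar in spirit to the real target
toy = pd.DataFrame({"feature": range(1000), "TARGET": [1] * 80 + [0] * 920})

# stratify keeps the share of TARGET = 1 (almost) identical in both splits
train, test = train_test_split(
    toy, test_size=0.2, stratify=toy["TARGET"], random_state=seed
)

rate_all = toy["TARGET"].mean()
rate_train = train["TARGET"].mean()
rate_test = test["TARGET"].mean()
print(rate_all, rate_train, rate_test)
```

Without `stratify`, a random split of a rare class can leave the test set with a noticeably different positive rate, which makes evaluation metrics harder to compare.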
All of this showed us that we are working with a dataset containing many variables of different types, with which we aim to explain and predict the behavior of our target variable, that is, when a client is likely to have difficulties repaying a loan.
With these conclusions, we face a complex problem that will be a real challenge in terms of model performance, because the simplest possible model would predict that no client has payment difficulties, and it would be wrong only 8.07% of the time. The goal will be to improve on that baseline by adding complexity to our analysis.
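The baseline described above can be made concrete with scikit-learn's `DummyClassifier`. The toy target below is hypothetical (8% positives, close to the 8.07% mentioned in the text). It shows why the error rate alone is a weak yardstick on imbalanced data: the baseline gets high accuracy while detecting none of the clients with payment difficulties.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# Toy target (hypothetical): 8 positives out of 100 observations
y = np.array([1] * 8 + [0] * 92)
X = np.zeros((len(y), 1))  # features are ignored by the dummy model

# Always predicts the most frequent class (TARGET = 0)
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = baseline.predict(X)

acc = accuracy_score(y, pred)  # share of correct predictions
rec = recall_score(y, pred)    # share of true positives that are detected
print(acc, rec)
```

Metrics that focus on the positive class, such as recall, precision or the area under the precision-recall curve, are therefore a more informative way to judge whether a model improves on this baseline.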
Things to keep in mind when running models:¶
- It may be necessary to balance the training data, using oversampling techniques.
- We identified some variables as important for predicting payment difficulties, such as OCCUPATION_TYPE (occupation), NAME_EDUCATION_TYPE (level of education), NAME_INCOME_TYPE (pensioner, student, worker) and CNT_CHILDREN (number of children, as a proxy for family size), among others.
- Mean Encoding could be used instead of One-Hot Encoding for categorical variables with many categories.
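Two of the ideas in the list above, mean encoding and oversampling, can be sketched with plain pandas. The function names, the smoothing parameter and the toy data are illustrative assumptions, not part of the notebook. Both steps should be fitted on the training split only, to avoid leaking information from the test set.

```python
import pandas as pd


def fit_mean_encoding(train, column, target, smoothing=10.0):
    """Smoothed target mean per category, learned on the training set only."""
    prior = train[target].mean()
    stats = train.groupby(column)[target].agg(["mean", "count"])
    # Small categories are pulled towards the global mean (prior)
    encoding = (stats["count"] * stats["mean"] + smoothing * prior) / (
        stats["count"] + smoothing
    )
    return encoding, prior


def random_oversample(train, target, seed=12354):
    """Resample minority classes with replacement up to the majority size."""
    counts = train[target].value_counts()
    majority_n = counts.max()
    parts = []
    for value, n in counts.items():
        group = train[train[target] == value]
        if n < majority_n:
            extra = group.sample(majority_n - n, replace=True, random_state=seed)
            group = pd.concat([group, extra])
        parts.append(group)
    # Shuffle so the classes are not stored in blocks
    return pd.concat(parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)


# Toy training data (hypothetical)
toy_train = pd.DataFrame({
    "job": ["Drivers"] * 4 + ["Managers"] * 6,
    "TARGET": [1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
})

encoding, prior = fit_mean_encoding(toy_train, "job", "TARGET")
encoded = toy_train["job"].map(encoding).fillna(prior)
# Categories never seen in training fall back to the global mean
unseen = pd.Series(["Cooks"]).map(encoding).fillna(prior)
balanced = random_oversample(toy_train, "TARGET")
print(encoding, balanced["TARGET"].value_counts())
```

Mean encoding keeps a high-cardinality variable such as OCCUPATION_TYPE in a single numeric column, whereas One-Hot Encoding would add one column per category.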